Protein: HD_HUMAN (P42858)

Summary

This is the summary of UniProt entry HD_HUMAN (P42858).

Description: Huntingtin
Source organism: Homo sapiens (Human) (NCBI taxonomy ID 9606)
Length: 3142 amino acids
Reference Proteome: ✓

Please note: at the start of each new Pfam data release, we take a copy of the UniProt sequence database, and this snapshot forms the basis of the overview shown here. Entries that UniProt removes after a Pfam release therefore remain in Pfam until the next Pfam data release.

Pfam domains

Source Domain Start End
disorder n/a 1 98
coiled_coil n/a 14 38
low_complexity n/a 18 76
transmembrane n/a 286 302
low_complexity n/a 291 302
transmembrane n/a 308 325
low_complexity n/a 309 325
disorder n/a 403 404
disorder n/a 406 412
low_complexity n/a 411 419
disorder n/a 415 419
disorder n/a 437 439
disorder n/a 442 622
low_complexity n/a 452 464
low_complexity n/a 531 545
disorder n/a 628 635
disorder n/a 641 672
transmembrane n/a 1062 1081
disorder n/a 1109 1114
disorder n/a 1116 1117
disorder n/a 1119 1123
disorder n/a 1166 1167
low_complexity n/a 1169 1180
disorder n/a 1176 1219
disorder n/a 1336 1337
disorder n/a 1339 1343
coiled_coil n/a 1441 1461
low_complexity n/a 1442 1460
Pfam DUF3652 1513 1553
disorder n/a 1722 1723
disorder n/a 1725 1728
low_complexity n/a 1832 1842
disorder n/a 1865 1868
low_complexity n/a 2070 2082
disorder n/a 2073 2088
disorder n/a 2330 2348
disorder n/a 2350 2351
disorder n/a 2483 2497
low_complexity n/a 2483 2494
low_complexity n/a 2633 2656
disorder n/a 2634 2662
low_complexity n/a 2879 2894
disorder n/a 2933 2947
transmembrane n/a 3051 3073
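
The region table above can be processed programmatically. The following is a minimal sketch, assuming the rows are available as whitespace-separated text with inclusive start and end coordinates. The function names `parse_rows` and `residues_covered` are hypothetical and not part of any Pfam tooling. It merges overlapping or adjacent regions from one source and counts the residues they cover, using a few rows copied from the table.

```python
# A few rows copied from the table above: Source, Domain, Start, End.
ROWS = """\
disorder n/a 1 98
coiled_coil n/a 14 38
low_complexity n/a 18 76
transmembrane n/a 286 302
disorder n/a 403 404
disorder n/a 406 412
Pfam DUF3652 1513 1553
"""


def parse_rows(text):
    """Parse whitespace-separated rows into (source, domain, start, end) tuples."""
    regions = []
    for line in text.splitlines():
        if not line.strip():
            continue
        source, domain, start, end = line.split()
        regions.append((source, domain, int(start), int(end)))
    return regions


def residues_covered(regions, source):
    """Count residues covered by at least one region of the given source.

    Coordinates are treated as 1-based and inclusive (an assumption based on
    how the table reads). Overlapping or directly adjacent spans are merged.
    """
    spans = sorted((s, e) for src, _, s, e in regions if src == source)
    total = 0
    current = None  # the span being extended, as [start, end]
    for start, end in spans:
        if current is None or start > current[1] + 1:
            if current is not None:
                total += current[1] - current[0] + 1
            current = [start, end]
        else:
            current[1] = max(current[1], end)
    if current is not None:
        total += current[1] - current[0] + 1
    return total


regions = parse_rows(ROWS)
disorder_total = residues_covered(regions, "disorder")
pfam_total = residues_covered(regions, "Pfam")
```

Only a subset of rows is used here, so the totals describe that subset, not the whole protein. Feeding in every row of the table would give the full per-source coverage.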

Sequence information

This is the amino acid sequence of the UniProt sequence database entry with the accession P42858. The sequence is stored in the Pfam database and is updated only with each new Pfam release, so the copy stored here may differ from the one currently held by UniProt.

Sequence:
1
MATLEKLMKA FESLKSFQQQ QQQQQQQQQQ QQQQQQQQPP PPPPPPPPPQ
50
51
LPQPPPQAQP LLPQPQPPPP PPPPPPGPAV AEEPLHRPKK ELSATKKDRV
100
101
NHCLTICENI VAQSVRNSPE FQKLLGIAME LFLLCSDDAE SDVRMVADEC
150
151
LNKVIKALMD SNLPRLQLEL YKEIKKNGAP RSLRAALWRF AELAHLVRPQ
200
201
KCRPYLVNLL PCLTRTSKRP EESVQETLAA AVPKIMASFG NFANDNEIKV
250
251
LLKAFIANLK SSSPTIRRTA AGSAVSICQH SRRTQYFYSW LLNVLLGLLV
300
301
PVEDEHSTLL ILGVLLTLRY LVPLLQQQVK DTSLKGSFGV TRKEMEVSPS
350
351
AEQLVQVYEL TLHHTQHQDH NVVTGALELL QQLFRTPPPE LLQTLTAVGG
400
401
IGQLTAAKEE SGGRSRSGSI VELIAGGGSS CSPVLSRKQK GKVLLGEEEA
450
451
LEDDSESRSD VSSSALTASV KDEISGELAA SSGVSTPGSA GHDIITEQPR
500
501
SQHTLQADSV DLASCDLTSS ATDGDEEDIL SHSSSQVSAV PSDPAMDLND
550
551
GTQASSPISD SSQTTTEGPD SAVTPSDSSE IVLDGTDNQY LGLQIGQPQD
600
601
EDEEATGILP DEASEAFRNS SMALQQAHLL KNMSHCRQPS DSSVDKFVLR
650
651
DEATEPGDQE NKPCRIKGDI GQSTDDDSAP LVHCVRLLSA SFLLTGGKNV
700
701
LVPDRDVRVS VKALALSCVG AAVALHPESF FSKLYKVPLD TTEYPEEQYV
750
751
SDILNYIDHG DPQVRGATAI LCGTLICSIL SRSRFHVGDW MGTIRTLTGN
800
801
TFSLADCIPL LRKTLKDESS VTCKLACTAV RNCVMSLCSS SYSELGLQLI
850
851
IDVLTLRNSS YWLVRTELLE TLAEIDFRLV SFLEAKAENL HRGAHHYTGL
900
901
LKLQERVLNN VVIHLLGDED PRVRHVAAAS LIRLVPKLFY KCDQGQADPV
950
951
VAVARDQSSV YLKLLMHETQ PPSHFSVSTI TRIYRGYNLL PSITDVTMEN
1000
1001
NLSRVIAAVS HELITSTTRA LTFGCCEALC LLSTAFPVCI WSLGWHCGVP
1050
1051
PLSASDESRK SCTVGMATMI LTLLSSAWFP LDLSAHQDAL ILAGNLLAAS
1100
1101
APKSLRSSWA SEEEANPAAT KQEEVWPALG DRALVPMVEQ LFSHLLKVIN
1150
1151
ICAHVLDDVA PGPAIKAALP SLTNPPSLSP IRRKGKEKEP GEQASVPLSP
1200
1201
KKGSEASAAS RQSDTSGPVT TSKSSSLGSF YHLPSYLKLH DVLKATHANY
1250
1251
KVTLDLQNST EKFGGFLRSA LDVLSQILEL ATLQDIGKCV EEILGYLKSC
1300
1301
FSREPMMATV CVQQLLKTLF GTNLASQFDG LSSNPSKSQG RAQRLGSSSV
1350
1351
RPGLYHYCFM APYTHFTQAL ADASLRNMVQ AEQENDTSGW FDVLQKVSTQ
1400
1401
LKTNLTSVTK NRADKNAIHN HIRLFEPLVI KALKQYTTTT CVQLQKQVLD
1450
1451
LLAQLVQLRV NYCLLDSDQV FIGFVLKQFE YIEVGQFRES EAIIPNIFFF
1500
1501
LVLLSYERYH SKQIIGIPKI IQLCDGIMAS GRKAVTHAIP ALQPIVHDLF
1550
1551
VLRGTNKADA GKELETQKEV VVSMLLRLIQ YHQVLEMFIL VLQQCHKENE
1600
1601
DKWKRLSRQI ADIILPMLAK QQMHIDSHEA LGVLNTLFEI LAPSSLRPVD
1650
1651
MLLRSMFVTP NTMASVSTVQ LWISGILAIL RVLISQSTED IVLSRIQELS
1700
1701
FSPYLISCTV INRLRDGDST STLEEHSEGK QIKNLPEETF SRFLLQLVGI
1750
1751
LLEDIVTKQL KVEMSEQQHT FYCQELGTLL MCLIHIFKSG MFRRITAAAT
1800
1801
RLFRSDGCGG SFYTLDSLNL RARSMITTHP ALVLLWCQIL LLVNHTDYRW
1850
1851
WAEVQQTPKR HSLSSTKLLS PQMSGEEEDS DLAAKLGMCN REIVRRGALI
1900
1901
LFCDYVCQNL HDSEHLTWLI VNHIQDLISL SHEPPVQDFI SAVHRNSAAS
1950
1951
GLFIQAIQSR CENLSTPTML KKTLQCLEGI HLSQSGAVLT LYVDRLLCTP
2000
2001
FRVLARMVDI LACRRVEMLL AANLQSSMAQ LPMEELNRIQ EYLQSSGLAQ
2050
2051
RHQRLYSLLD RFRLSTMQDS LSPSPPVSSH PLDGDGHVSL ETVSPDKDWY
2100
2101
VHLVKSQCWT RSDSALLEGA ELVNRIPAED MNAFMMNSEF NLSLLAPCLS
2150
2151
LGMSEISGGQ KSALFEAARE VTLARVSGTV QQLPAVHHVF QPELPAEPAA
2200
2201
YWSKLNDLFG DAALYQSLPT LARALAQYLV VVSKLPSHLH LPPEKEKDIV
2250
2251
KFVVATLEAL SWHLIHEQIP LSLDLQAGLD CCCLALQLPG LWSVVSSTEF
2300
2301
VTHACSLIYC VHFILEAVAV QPGEQLLSPE RRTNTPKAIS EEEEEVDPNT
2350
2351
QNPKYITAAC EMVAEMVESL QSVLALGHKR NSGVPAFLTP LLRNIIISLA
2400
2401
RLPLVNSYTR VPPLVWKLGW SPKPGGDFGT AFPEIPVEFL QEKEVFKEFI
2450
2451
YRINTLGWTS RTQFEETWAT LLGVLVTQPL VMEQEESPPE EDTERTQINV
2500
2501
LAVQAITSLV LSAMTVPVAG NPAVSCLEQQ PRNKPLKALD TRFGRKLSII
2550
2551
RGIVEQEIQA MVSKRENIAT HHLYQAWDPV PSLSPATTGA LISHEKLLLQ
2600
2601
INPERELGSM SYKLGQVSIH SVWLGNSITP LREEEWDEEE EEEADAPAPS
2650
2651
SPPTSPVNSR KHRAGVDIHS CSQFLLELYS RWILPSSSAR RTPAILISEV
2700
2701
VRSLLVVSDL FTERNQFELM YVTLTELRRV HPSEDEILAQ YLVPATCKAA
2750
2751
AVLGMDKAVA EPVSRLLEST LRSSHLPSRV GALHGVLYVL ECDLLDDTAK
2800
2801
QLIPVISDYL LSNLKGIAHC VNIHSQQHVL VMCATAFYLI ENYPLDVGPE
2850
2851
FSASIIQMCG VMLSGSEEST PSIIYHCALR GLERLLLSEQ LSRLDAESLV
2900
2901
KLSVDRVNVH SPHRAMAALG LMLTCMYTGK EKVSPGRTSD PNPAAPDSES
2950
2951
VIVAMERVSV LFDRIRKGFP CEARVVARIL PQFLDDFFPP QDIMNKVIGE
3000
3001
FLSNQQPYPQ FMATVVYKVF QTLHSTGQSS MVRDWVMLSL SNFTQRAPVA
3050
3051
MATWSLSCFF VSASTSPWVA AILPHVISRM GKLEQVDVNL FCLVATDFYR
3100
3101
HQIEEELDRR AFQSVLEVVA APGSPYHRLL TCLRNVHKVT TC
3142

Checksums:
CRC64:A267509E84D52F0D
MD5:8df3a69fce2ce3cdf8eb1ce4935f3deb
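
To check a local copy of the sequence against the MD5 value above, the formatted block first has to be reduced to a plain string of residues. The sketch below assumes the checksum is computed over the uppercase one-letter sequence with no whitespace or position markers; the page does not state this, so treat it as an assumption to verify. The helper names `clean_sequence` and `md5_hex` are hypothetical. The CRC64 value is not covered.

```python
import hashlib


def clean_sequence(formatted):
    """Strip position markers, spaces and newlines, keeping only letters."""
    return "".join(ch for ch in formatted if ch.isalpha()).upper()


def md5_hex(sequence):
    """Return the hexadecimal MD5 digest of an ASCII sequence string."""
    return hashlib.md5(sequence.encode("ascii")).hexdigest()


# First block of the formatted sequence, copied from the listing above.
block = """1
MATLEKLMKA FESLKSFQQQ QQQQQQQQQQ QQQQQQQQPP PPPPPPPPPQ
50
"""
first_fifty = clean_sequence(block)
```

Applying `clean_sequence` to the whole listing and comparing `len(...)` with the stated length of 3142 is a quick sanity check before comparing `md5_hex(...)` with the checksum shown.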

Structures

For sequences that have a structure in the Protein Data Bank, we use the mapping between the UniProt, PDB and Pfam coordinate systems from the PDBe SIFTS project to map Pfam domains onto three-dimensional structures. The table below shows the mapping between Pfam domains, this UniProt entry and the corresponding three-dimensional structures.

Pfam family  UniProt residues  PDB ID  PDB chain ID  PDB residues
DUF3652      1513 - 1553       6X9O    A             1515 - 1555
DUF3652      1515 - 1552       6EZ8    A             1511 - 1548
DUF3652      1515 - 1552       6RMH    A             1511 - 1548
DUF3652      1515 - 1552       6YEJ    A             1566 - 1603
DUF3652      1515 - 1552       7DXJ    A             1511 - 1548
DUF3652      1515 - 1552       7DXK    A             1511 - 1548
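
The rows above show that UniProt and PDB residue numbering can differ for the same segment. As a rough illustration, the sketch below converts a UniProt position to a PDB residue number, assuming a constant offset within one mapped range. This is a simplification: actual SIFTS mappings are given per segment and may contain gaps or insertions. The function name `uniprot_to_pdb` is hypothetical.

```python
def uniprot_to_pdb(position, uniprot_range, pdb_range):
    """Convert a UniProt position to a PDB residue number within one mapped range.

    Assumes both ranges are inclusive and differ only by a constant offset.
    """
    u_start, u_end = uniprot_range
    p_start, p_end = pdb_range
    if (u_end - u_start) != (p_end - p_start):
        raise ValueError("ranges differ in length; a constant offset does not apply")
    if not u_start <= position <= u_end:
        raise ValueError("position lies outside the mapped UniProt range")
    return position + (p_start - u_start)


# Ranges taken from the 6X9O and 6EZ8 rows of the table above.
in_6x9o = uniprot_to_pdb(1520, (1513, 1553), (1515, 1555))
in_6ez8 = uniprot_to_pdb(1520, (1515, 1552), (1511, 1548))
```

The same UniProt residue ends up with different numbers in different PDB entries, so a position should always be quoted together with the coordinate system it belongs to.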

TreeFam

Below is a phylogenetic tree of animal genes, with ortholog and paralog assignments, from TreeFam.