
Protein: FILA_HUMAN (P20930)

Summary

This is the summary of UniProt entry FILA_HUMAN (P20930).

Description: Filaggrin
Source organism: Homo sapiens (Human) (NCBI taxonomy ID 9606)
Length: 4061 amino acids
Reference Proteome: ✓

Please note: when we start each new Pfam data release, we take a copy of the UniProt sequence database. This snapshot of UniProt forms the basis of the overview that you see here. It is important to note that, although some UniProt entries may be removed after a Pfam release, these entries will not be removed from Pfam until the next Pfam data release.

Pfam domains


Source Domain Start End
Pfam S_100 4 48
low_complexity n/a 28 40
low_complexity n/a 48 59
disorder n/a 55 58
disorder n/a 89 3974
low_complexity n/a 99 110
low_complexity n/a 130 143
low_complexity n/a 175 191
coiled_coil n/a 190 210
low_complexity n/a 197 208
Pfam Filaggrin 257 306
low_complexity n/a 260 269
Pfam Filaggrin 373 428
low_complexity n/a 452 468
low_complexity n/a 482 497
low_complexity n/a 517 532
low_complexity n/a 539 551
Pfam Filaggrin 574 630
low_complexity n/a 617 631
Pfam Filaggrin 697 753
low_complexity n/a 760 778
low_complexity n/a 777 792
low_complexity n/a 864 876
low_complexity n/a 876 897
Pfam Filaggrin 899 955
low_complexity n/a 902 920
low_complexity n/a 947 960
low_complexity n/a 978 989
Pfam Filaggrin 1022 1077
low_complexity n/a 1107 1116
low_complexity n/a 1132 1144
low_complexity n/a 1170 1179
low_complexity n/a 1188 1200
Pfam Filaggrin 1223 1279
low_complexity n/a 1226 1238
low_complexity n/a 1271 1287
low_complexity n/a 1329 1340
Pfam Filaggrin 1346 1401
low_complexity n/a 1426 1440
low_complexity n/a 1512 1521
Pfam Filaggrin 1547 1603
low_complexity n/a 1550 1562
low_complexity n/a 1597 1616
low_complexity n/a 1661 1676
Pfam Filaggrin 1670 1725
low_complexity n/a 1752 1764
low_complexity n/a 1835 1848
low_complexity n/a 1851 1867
Pfam Filaggrin 1871 1927
low_complexity n/a 1874 1891
low_complexity n/a 1914 1928
Pfam Filaggrin 1994 2050
low_complexity n/a 2074 2089
low_complexity n/a 2161 2173
low_complexity n/a 2173 2191
Pfam Filaggrin 2196 2252
low_complexity n/a 2199 2215
low_complexity n/a 2230 2244
low_complexity n/a 2244 2262
Pfam Filaggrin 2319 2374
low_complexity n/a 2401 2413
low_complexity n/a 2428 2443
low_complexity n/a 2485 2497
low_complexity n/a 2499 2515
Pfam Filaggrin 2520 2576
low_complexity n/a 2523 2540
low_complexity n/a 2543 2556
low_complexity n/a 2564 2577
Pfam Filaggrin 2643 2698
low_complexity n/a 2708 2724
low_complexity n/a 2801 2821
low_complexity n/a 2821 2842
Pfam Filaggrin 2844 2900
low_complexity n/a 2847 2859
low_complexity n/a 2874 2899
low_complexity n/a 2892 2905
Pfam Filaggrin 2967 3022
low_complexity n/a 3032 3048
low_complexity n/a 3125 3145
low_complexity n/a 3145 3166
Pfam Filaggrin 3168 3224
low_complexity n/a 3198 3223
low_complexity n/a 3216 3229
low_complexity n/a 3247 3258
Pfam Filaggrin 3291 3346
low_complexity n/a 3373 3385
low_complexity n/a 3401 3413
low_complexity n/a 3457 3469
low_complexity n/a 3469 3488
Pfam Filaggrin 3492 3548
low_complexity n/a 3515 3528
low_complexity n/a 3535 3549
Pfam Filaggrin 3615 3670
low_complexity n/a 3697 3709
low_complexity n/a 3781 3793
Pfam Filaggrin 3816 3872
low_complexity n/a 3819 3831
low_complexity n/a 3843 3871
low_complexity n/a 3874 3885
low_complexity n/a 3942 3960
disorder n/a 3985 3996
disorder n/a 4002 4003
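
The rows above follow a simple four-column layout (source, domain name, start, end), so they can be loaded into records for further analysis. The following is a minimal sketch, not part of Pfam's tooling: the `Region` type and `parse_regions` function are hypothetical names, and the sketch assumes each row is whitespace-separated with `n/a` marking a region that has no domain name. The sample text is a small excerpt of the table.

```python
from typing import List, NamedTuple, Optional


class Region(NamedTuple):
    """One row of the domain table (hypothetical record type)."""
    source: str            # e.g. "Pfam", "low_complexity", "disorder"
    domain: Optional[str]  # None when the table shows "n/a"
    start: int             # 1-based start residue
    end: int               # 1-based end residue, inclusive


def parse_regions(text: str) -> List[Region]:
    """Parse whitespace-separated 'Source Domain Start End' rows."""
    regions = []
    for line in text.strip().splitlines():
        parts = line.split()
        # Skip the header row and anything that is not a 4-column data row.
        if len(parts) != 4 or not (parts[2].isdigit() and parts[3].isdigit()):
            continue
        source, domain, start, end = parts
        regions.append(
            Region(source, None if domain == "n/a" else domain, int(start), int(end))
        )
    return regions


# A short excerpt of the table above.
SAMPLE = """Source Domain Start End
Pfam S_100 4 48
low_complexity n/a 28 40
Pfam Filaggrin 257 306
low_complexity n/a 260 269
Pfam Filaggrin 373 428
"""

regions = parse_regions(SAMPLE)
filaggrin = [r for r in regions if r.source == "Pfam" and r.domain == "Filaggrin"]
lengths = [r.end - r.start + 1 for r in filaggrin]
print(len(filaggrin), lengths)
```

Applied to the full table, the same filter would list every Filaggrin repeat together with its length in residues.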

Sequence information

This is the amino acid sequence of the UniProt sequence database entry with the accession P20930. The sequence is stored in the Pfam database and is updated only with each new Pfam release, so the sequence we store may differ from the one currently held by UniProt.

Sequence:
1
MSTLLENIFA IINLFKQYSK KDKNTDTLSK KELKELLEKE FRQILKNPDD
50
51
PDMVDVFMDH LDIDHNKKID FTEFLLMVFK LAQAYYESTR KENLPISGHK
100
101
HRKHSHHDKH EDNKQEENKE NRKRPSSLER RNNRKGNKGR SKSPRETGGK
150
151
RHESSSEKKE RKGYSPTHRE EEYGKNHHNS SKKEKNKTEN TRLGDNRKRL
200
201
SERLEEKEDN EEGVYDYENT GRMTQKWIQS GHIATYYTIQ DEAYDTTDSL
250
251
LEENKIYERS RSSDGKSSSQ VNRSRHENTS QVPLQESRTR KRRGSRVSQD
300
301
RDSEGHSEDS ERHSGSASRN HHGSAWEQSR DGSRHPRSHD EDRASHGHSA
350
351
DSSRQSGTRH AETSSRGQTA SSHEQARSSP GERHGSGHQQ SADSSRHSAT
400
401
GRGQASSAVS DRGHRGSSGS QASDSEGHSE NSDTQSVSGH GKAGLRQQSH
450
451
QESTRGRSGE RSGRSGSSLY QVSTHEQPDS AHGRTGTSTG GRQGSHHEQA
500
501
RDSSRHSASQ EGQDTIRGHP GSSRGGRQGS HHEQSVNRSG HSGSHHSHTT
550
551
SQGRSDASHG QSGSRSASRQ TRNEEQSGDG TRHSGSRHHE ASSQADSSRH
600
601
SQVGQGQSSG PRTSRNQGSS VSQDSDSQGH SEDSERWSGS ASRNHHGSAQ
650
651
EQSRDGSRHP RSHHEDRAGH GHSADSSRKS GTRHTQNSSS GQAASSHEQA
700
701
RSSAGERHGS RHQLQSADSS RHSGTGHGQA SSAVRDSGHR GSSGSQATDS
750
751
EGHSEDSDTQ SVSGHGQAGH HQQSHQESAR DRSGERSRRS GSFLYQVSTH
800
801
KQSESSHGWT GPSTGVRQGS HHEQARDNSR HSASQDGQDT IRGHPGSSRR
850
851
GRQGSHHEQS VDRSGHSGSH HSHTTSQGRS DASRGQSGSR SASRTTRNEE
900
901
QSRDGSRHSG SRHHEASSHA DISRHSQAGQ GQSEGSRTSR RQGSSVSQDS
950
951
DSEGHSEDSE RWSGSASRNH RGSAQEQSRH GSRHPRSHHE DRAGHGHSAD
1000
1001
SSRQSGTPHA ETSSGGQAAS SHEQARSSPG ERHGSRHQQS ADSSRHSGIP
1050
1051
RRQASSAVRD SGHWGSSGSQ ASDSEGHSEE SDTQSVSGHG QDGPHQQSHQ
1100
1101
ESARDWSGGR SGRSGSFIYQ VSTHEQSESA HGRTRTSTGR RQGSHHEQAR
1150
1151
DSSRHSASQE GQDTIRAHPG SRRGGRQGSH HEQSVDRSGH SGSHHSHTTS
1200
1201
QGRSDASHGQ SGSRSASRQT RKDKQSGDGS RHSGSRHHEA ASWADSSRHS
1250
1251
QVGQEQSSGS RTSRHQGSSV SQDSDSERHS DDSERLSGSA SRNHHGSSRE
1300
1301
QSRDGSRHPG FHQEDRASHG HSADSSRQSG THHTESSSHG QAVSSHEQAR
1350
1351
SSPGERHGSR HQQSADSSRH SGIGHRQASS AVRDSGHRGS SGSQVTNSEG
1400
1401
HSEDSDTQSV SAHGQAGPHQ QSHKESARGQ SGESSGRSRS FLYQVSSHEQ
1450
1451
SESTHGQTAP STGGRQGSRH EQARNSSRHS ASQDGQDTIR GHPGSSRGGR
1500
1501
QGSYHEQSVD RSGHSGYHHS HTTPQGRSDA SHGQSGPRSA SRQTRNEEQS
1550
1551
GDGSRHSGSR HHEPSTRAGS SRHSQVGQGE SAGSKTSRRQ GSSVSQDRDS
1600
1601
EGHSEDSERR SESASRNHYG SAREQSRHGS RNPRSHQEDR ASHGHSAESS
1650
1651
RQSGTRHAET SSGGQAASSQ EQARSSPGER HGSRHQQSAD SSTDSGTGRR
1700
1701
QDSSVVGDSG NRGSSGSQAS DSEGHSEESD TQSVSAHGQA GPHQQSHQES
1750
1751
TRGQSGERSG RSGSFLYQVS THEQSESAHG RTGPSTGGRQ RSRHEQARDS
1800
1801
SRHSASQEGQ DTIRGHPGSS RGGRQGSHYE QSVDSSGHSG SHHSHTTSQE
1850
1851
RSDVSRGQSG SRSVSRQTRN EKQSGDGSRH SGSRHHEASS RADSSRHSQV
1900
1901
GQGQSSGPRT SRNQGSSVSQ DSDSQGHSED SERWSGSASR NHLGSAWEQS
1950
1951
RDGSRHPGSH HEDRAGHGHS ADSSRQSGTR HTESSSRGQA ASSHEQARSS
2000
2001
AGERHGSHHQ LQSADSSRHS GIGHGQASSA VRDSGHRGYS GSQASDSEGH
2050
2051
SEDSDTQSVS AQGKAGPHQQ SHKESARGQS GESSGRSGSF LYQVSTHEQS
2100
2101
ESTHGQSAPS TGGRQGSHYD QAQDSSRHSA SQEGQDTIRG HPGPSRGGRQ
2150
2151
GSHQEQSVDR SGHSGSHHSH TTSQGRSDAS RGQSGSRSAS RKTYDKEQSG
2200
2201
DGSRHSGSHH HEASSWADSS RHSLVGQGQS SGPRTSRPRG SSVSQDSDSE
2250
2251
GHSEDSERRS GSASRNHHGS AQEQSRDGSR HPRSHHEDRA GHGHSAESSR
2300
2301
QSGTHHAENS SGGQAASSHE QARSSAGERH GSHHQQSADS SRHSGIGHGQ
2350
2351
ASSAVRDSGH RGSSGSQASD SEGHSEDSDT QSVSAHGQAG PHQQSHQEST
2400
2401
RGRSAGRSGR SGSFLYQVST HEQSESAHGR TGTSTGGRQG SHHKQARDSS
2450
2451
RHSTSQEGQD TIHGHPGSSS GGRQGSHYEQ LVDRSGHSGS HHSHTTSQGR
2500
2501
SDASHGHSGS RSASRQTRND EQSGDGSRHS GSRHHEASSR ADSSGHSQVG
2550
2551
QGQSEGPRTS RNWGSSFSQD SDSQGHSEDS ERWSGSASRN HHGSAQEQLR
2600
2601
DGSRHPRSHQ EDRAGHGHSA DSSRQSGTRH TQTSSGGQAA SSHEQARSSA
2650
2651
GERHGSHHQQ SADSSRHSGI GHGQASSAVR DSGHRGYSGS QASDNEGHSE
2700
2701
DSDTQSVSAH GQAGSHQQSH QESARGRSGE TSGHSGSFLY QVSTHEQSES
2750
2751
SHGWTGPSTR GRQGSRHEQA QDSSRHSASQ DGQDTIRGHP GSSRGGRQGY
2800
2801
HHEHSVDSSG HSGSHHSHTT SQGRSDASRG QSGSRSASRT TRNEEQSGDG
2850
2851
SRHSGSRHHE ASTHADISRH SQAVQGQSEG SRRSRRQGSS VSQDSDSEGH
2900
2901
SEDSERWSGS ASRNHHGSAQ EQLRDGSRHP RSHQEDRAGH GHSADSSRQS
2950
2951
GTRHTQTSSG GQAASSHEQA RSSAGERHGS HHQQSADSSR HSGIGHGQAS
3000
3001
SAVRDSGHRG YSGSQASDNE GHSEDSDTQS VSAHGQAGSH QQSHQESARG
3050
3051
RSGETSGHSG SFLYQVSTHE QSESSHGWTG PSTRGRQGSR HEQAQDSSRH
3100
3101
SASQYGQDTI RGHPGSSRGG RQGYHHEHSV DSSGHSGSHH SHTTSQGRSD
3150
3151
ASRGQSGSRS ASRTTRNEEQ SGDSSRHSVS RHHEASTHAD ISRHSQAVQG
3200
3201
QSEGSRRSRR QGSSVSQDSD SEGHSEDSER WSGSASRNHR GSVQEQSRHG
3250
3251
SRHPRSHHED RAGHGHSADR SRQSGTRHAE TSSGGQAASS HEQARSSPGE
3300
3301
RHGSRHQQSA DSSRHSGIPR GQASSAVRDS RHWGSSGSQA SDSEGHSEES
3350
3351
DTQSVSGHGQ AGPHQQSHQE SARDRSGGRS GRSGSFLYQV STHEQSESAH
3400
3401
GRTRTSTGRR QGSHHEQARD SSRHSASQEG QDTIRGHPGS SRRGRQGSHY
3450
3451
EQSVDRSGHS GSHHSHTTSQ GRSDASRGQS GSRSASRQTR NDEQSGDGSR
3500
3501
HSWSHHHEAS TQADSSRHSQ SGQGQSAGPR TSRNQGSSVS QDSDSQGHSE
3550
3551
DSERWSGSAS RNHRGSAQEQ SRDGSRHPTS HHEDRAGHGH SAESSRQSGT
3600
3601
HHAENSSGGQ AASSHEQARS SAGERHGSHH QQSADSSRHS GIGHGQASSA
3650
3651
VRDSGHRGSS GSQASDSEGH SEDSDTQSVS AHGQAGPHQQ SHQESTRGRS
3700
3701
AGRSGRSGSF LYQVSTHEQS ESAHGRAGPS TGGRQGSRHE QARDSSRHSA
3750
3751
SQEGQDTIRG HPGSRRGGRQ GSYHEQSVDR SGHSGSHHSH TTSQGRSDAS
3800
3801
HGQSGSRSAS RETRNEEQSG DGSRHSGSRH HEASTQADSS RHSQSGQGES
3850
3851
AGSRRSRRQG SSVSQDSDSE AYPEDSERRS ESASRNHHGS SREQSRDGSR
3900
3901
HPGSSHRDTA SHVQSSPVQS DSSTAKEHGH FSSLSQDSAY HSGIQSRGSP
3950
3951
HSSSSYHYQS EGTERQKGQS GLVWRHGSYG SADYDYGESG FRHSQHGSVS
4000
4001
YNSNPVVFKE RSDICKASAF GKDHPRYYAT YINKDPGLCG HSSDISKQLG
4050
4051
FSQSQRYYYY E                                          
4061
 

Checksums:
CRC64:3F4B1181F04AD9C0
MD5:3c7f6de9e68ac2b91680807c8c912dfe
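
To compare a local copy of the sequence against the MD5 checksum above, the position numbers and spacing of the formatted display must first be stripped. The sketch below shows one way to do that with the Python standard library. It assumes that the checksum is computed over the plain upper-case residue string with no whitespace; the page does not state this, so treat it as an assumption. The function names `clean_sequence` and `md5_hex` are illustrative, and the CRC64 checksum is not covered here.

```python
import hashlib
import re


def clean_sequence(formatted: str) -> str:
    """Remove position numbers, spaces and newlines; keep residue letters only."""
    return re.sub(r"[^A-Za-z]", "", formatted).upper()


def md5_hex(sequence: str) -> str:
    """Return the hexadecimal MD5 digest of an ASCII residue string."""
    return hashlib.md5(sequence.encode("ascii")).hexdigest()


# First display row of the sequence above, including its position markers.
block = """1
MSTLLENIFA IINLFKQYSK KDKNTDTLSK KELKELLEKE FRQILKNPDD
50
"""

seq = clean_sequence(block)
print(len(seq))      # number of residues recovered from the block
print(md5_hex(seq))  # digest of this fragment only, not of the full entry
```

Running `md5_hex(clean_sequence(...))` over the complete formatted sequence and comparing the result with the listed MD5 value is a quick way to check that no residues were lost when copying.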

Structures

For sequences that have a structure in the Protein Data Bank, we use the mapping between the UniProt, PDB and Pfam coordinate systems from the PDBe SIFTS project to map Pfam domains onto three-dimensional structures. The table below shows the mapping between Pfam domains, this UniProt entry and the corresponding three-dimensional structure.

Pfam family UniProt residues PDB ID PDB chain ID PDB residues
S_100 4 - 48 4PCW A 3 - 47
S_100 4 - 48 4PCW B 3 - 47
S_100 4 - 48 4PCW C 3 - 47
S_100 4 - 48 4PCW D 3 - 47
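
In the table above, UniProt residues 4 - 48 correspond to PDB residues 3 - 47, so within this segment the two numbering schemes differ by a constant offset. The following sketch converts a UniProt position to the PDB numbering under that assumption. It is an illustration only: `uniprot_to_pdb` is a hypothetical helper, not part of SIFTS or Pfam, and it assumes a gapless, linear mapping across the segment, which real SIFTS mappings do not always guarantee.

```python
from typing import Optional, Tuple


def uniprot_to_pdb(
    pos: int,
    uniprot_range: Tuple[int, int] = (4, 48),  # UniProt residues from the table
    pdb_range: Tuple[int, int] = (3, 47),      # PDB residues from the table
) -> Optional[int]:
    """Map a UniProt position to PDB numbering, or None if outside the segment."""
    u_start, u_end = uniprot_range
    p_start, p_end = pdb_range
    # A linear mapping only makes sense if both ranges have the same length.
    if (u_end - u_start) != (p_end - p_start):
        raise ValueError("ranges differ in length; a constant offset does not apply")
    if not (u_start <= pos <= u_end):
        return None
    return p_start + (pos - u_start)


print(uniprot_to_pdb(4))   # first residue of the mapped S_100 segment
print(uniprot_to_pdb(48))  # last residue of the mapped S_100 segment
print(uniprot_to_pdb(49))  # outside the mapped segment
```

Returning `None` for positions outside the segment avoids silently extrapolating the offset to residues that the structure does not cover.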

TreeFam

Below is a phylogenetic tree of animal genes, with ortholog and paralog assignments, from TreeFam.