
Protein: CO7A1_HUMAN (Q02388)

Summary

This is the summary of UniProt entry CO7A1_HUMAN (Q02388).

Description: Collagen alpha-1(VII) chain
Source organism: Homo sapiens (Human) (NCBI taxonomy ID 9606)
Length: 2944 amino acids
Reference Proteome: ✓

Please note: at the start of each new Pfam data release, we take a copy of the UniProt sequence database, and this snapshot forms the basis of the overview shown here. UniProt entries that are removed after a Pfam release will remain in Pfam until the next Pfam data release.

Pfam domains


Source Domain Start End
sig_p n/a 1 23
Pfam VWA 38 209
disorder n/a 219 240
Pfam fn3 233 318
disorder n/a 275 291
disorder n/a 321 323
disorder n/a 325 332
Pfam fn3 331 407
disorder n/a 367 369
disorder n/a 371 381
disorder n/a 410 412
Pfam fn3 419 492
disorder n/a 448 449
disorder n/a 465 466
low_complexity n/a 477 488
disorder n/a 496 514
Pfam fn3 509 587
disorder n/a 584 594
Pfam fn3 599 677
disorder n/a 628 656
disorder n/a 663 667
disorder n/a 676 682
Pfam fn3 687 765
disorder n/a 689 694
disorder n/a 715 739
disorder n/a 741 744
disorder n/a 751 756
disorder n/a 762 772
Pfam fn3 777 855
disorder n/a 816 828
disorder n/a 853 864
Pfam fn3 868 946
disorder n/a 869 873
disorder n/a 898 899
disorder n/a 907 918
disorder n/a 935 936
disorder n/a 941 957
disorder n/a 959 960
disorder n/a 996 1013
disorder n/a 1038 1040
disorder n/a 1042 1044
Pfam VWA 1054 1227
disorder n/a 1067 1069
disorder n/a 1198 1218
disorder n/a 1225 1239
disorder n/a 1241 2797
Pfam Collagen 1251 1310
low_complexity n/a 1276 1289
Pfam Collagen 1296 1361
low_complexity n/a 1318 1334
low_complexity n/a 1325 1352
low_complexity n/a 1348 1367
low_complexity n/a 1363 1394
low_complexity n/a 1398 1452
Pfam Collagen 1449 1504
low_complexity n/a 1452 1473
Pfam Collagen 1489 1548
low_complexity n/a 1511 1531
low_complexity n/a 1573 1595
low_complexity n/a 1609 1637
low_complexity n/a 1630 1651
low_complexity n/a 1690 1705
low_complexity n/a 1784 1802
low_complexity n/a 1814 1838
low_complexity n/a 1877 1918
low_complexity n/a 2005 2017
Pfam Collagen 2034 2099
low_complexity n/a 2038 2075
low_complexity n/a 2075 2091
Pfam Collagen 2099 2158
low_complexity n/a 2164 2204
low_complexity n/a 2203 2223
low_complexity n/a 2217 2250
Pfam Collagen 2257 2321
low_complexity n/a 2271 2293
low_complexity n/a 2303 2319
Pfam Collagen 2317 2374
low_complexity n/a 2334 2350
low_complexity n/a 2368 2406
Pfam Collagen 2373 2430
Pfam Collagen 2407 2465
low_complexity n/a 2427 2454
low_complexity n/a 2462 2486
Pfam Collagen 2466 2524
Pfam Collagen 2526 2584
Pfam Collagen 2563 2639
low_complexity n/a 2571 2601
Pfam Collagen 2617 2694
low_complexity n/a 2711 2736
Pfam Collagen 2722 2787
low_complexity n/a 2744 2766
low_complexity n/a 2758 2783
disorder n/a 2804 2807
disorder n/a 2813 2876
low_complexity n/a 2839 2850
low_complexity n/a 2847 2864
Pfam Kunitz_BPTI 2875 2930
disorder n/a 2930 2944
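
The rows above follow a simple whitespace-separated layout (Source, Domain, Start, End), so they can be read programmatically. The following is a minimal sketch, not part of the Pfam site: the names `ROWS`, `parse_rows` and `region_length` are hypothetical, only a few rows are copied in as sample input, and the length calculation assumes that Start and End are 1-based inclusive coordinates.

```python
from collections import Counter

# A few rows copied from the table above; the remaining rows parse the same way.
ROWS = """\
sig_p n/a 1 23
Pfam VWA 38 209
disorder n/a 219 240
Pfam fn3 233 318
Pfam Collagen 1251 1310
Pfam Kunitz_BPTI 2875 2930
"""

def parse_rows(text):
    """Split each 'Source Domain Start End' row into a typed tuple."""
    regions = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        source, domain, start, end = parts
        regions.append((source, domain, int(start), int(end)))
    return regions

def region_length(region):
    """Length in residues, assuming inclusive 1-based coordinates."""
    _, _, start, end = region
    return end - start + 1

regions = parse_rows(ROWS)

# Count how many times each Pfam family appears in the sample rows.
pfam_counts = Counter(domain for source, domain, _, _ in regions if source == "Pfam")
```

Filtering on the Source column separates Pfam family matches from predicted features such as `sig_p`, `disorder` and `low_complexity`.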


Sequence information

This is the amino acid sequence of the UniProt sequence database entry with the accession Q02388. Pfam stores this sequence and updates it only with each new Pfam release, so the sequence stored here may differ from the one currently held by UniProt.

Sequence:
1 MTLRLLVAAL CAGILAEAPR VRAQHRERVT CTRLYAADIV FLLDGSSSIG
51 RSNFREVRSF LEGLVLPFSG AASAQGVRFA TVQYSDDPRT EFGLDALGSG
101 GDVIRAIREL SYKGGNTRTG AAILHVADHV FLPQLARPGV PKVCILITDG
151 KSQDLVDTAA QRLKGQGVKL FAVGIKNADP EELKRVASQP TSDFFFFVND
201 FSILRTLLPL VSRRVCTTAG GVPVTRPPDD STSAPRDLVL SEPSSQSLRV
251 QWTAASGPVT GYKVQYTPLT GLGQPLPSER QEVNVPAGET SVRLRGLRPL
301 TEYQVTVIAL YANSIGEAVS GTARTTALEG PELTIQNTTA HSLLVAWRSV
351 PGATGYRVTW RVLSGGPTQQ QELGPGQGSV LLRDLEPGTD YEVTVSTLFG
401 RSVGPATSLM ARTDASVEQT LRPVILGPTS ILLSWNLVPE ARGYRLEWRR
451 ETGLEPPQKV VLPSDVTRYQ LDGLQPGTEY RLTLYTLLEG HEVATPATVV
501 PTGPELPVSP VTDLQATELP GQRVRVSWSP VPGATQYRII VRSTQGVERT
551 LVLPGSQTAF DLDDVQAGLS YTVRVSARVG PREGSASVLT VRREPETPLA
601 VPGLRVVVSD ATRVRVAWGP VPGASGFRIS WSTGSGPESS QTLPPDSTAT
651 DITGLQPGTT YQVAVSVLRG REEGPAAVIV ARTDPLGPVR TVHVTQASSS
701 SVTITWTRVP GATGYRVSWH SAHGPEKSQL VSGEATVAEL DGLEPDTEYT
751 VHVRAHVAGV DGPPASVVVR TAPEPVGRVS RLQILNASSD VLRITWVGVT
801 GATAYRLAWG RSEGGPMRHQ ILPGNTDSAE IRGLEGGVSY SVRVTALVGD
851 REGTPVSIVV TTPPEAPPAL GTLHVVQRGE HSLRLRWEPV PRAQGFLLHW
901 QPEGGQEQSR VLGPELSSYH LDGLEPATQY RVRLSVLGPA GEGPSAEVTA
951 RTESPRVPSI ELRVVDTSID SVTLAWTPVS RASSYILSWR PLRGPGQEVP
1001 GSPQTLPGIS SSQRVTGLEP GVSYIFSLTP VLDGVRGPEA SVTQTPVCPR
1051 GLADVVFLPH ATQDNAHRAE ATRRVLERLV LALGPLGPQA VQVGLLSYSH
1101 RPSPLFPLNG SHDLGIILQR IRDMPYMDPS GNNLGTAVVT AHRYMLAPDA
1151 PGRRQHVPGV MVLLVDEPLR GDIFSPIREA QASGLNVVML GMAGADPEQL
1201 RRLAPGMDSV QTFFAVDDGP SLDQAVSGLA TALCQASFTT QPRPEPCPVY
1251 CPKGQKGEPG EMGLRGQVGP PGDPGLPGRT GAPGPQGPPG SATAKGERGF
1301 PGADGRPGSP GRAGNPGTPG APGLKGSPGL PGPRGDPGER GPRGPKGEPG
1351 APGQVIGGEG PGLPGRKGDP GPSGPPGPRG PLGDPGPRGP PGLPGTAMKG
1401 DKGDRGERGP PGPGEGGIAP GEPGLPGLPG SPGPQGPVGP PGKKGEKGDS
1451 EDGAPGLPGQ PGSPGEQGPR GPPGAIGPKG DRGFPGPLGE AGEKGERGPP
1501 GPAGSRGLPG VAGRPGAKGP EGPPGPTGRQ GEKGEPGRPG DPAVVGPAVA
1551 GPKGEKGDVG PAGPRGATGV QGERGPPGLV LPGDPGPKGD PGDRGPIGLT
1601 GRAGPPGDSG PPGEKGDPGR PGPPGPVGPR GRDGEVGEKG DEGPPGDPGL
1651 PGKAGERGLR GAPGVRGPVG EKGDQGDPGE DGRNGSPGSS GPKGDRGEPG
1701 PPGPPGRLVD TGPGAREKGE PGDRGQEGPR GPKGDPGLPG APGERGIEGF
1751 RGPPGPQGDP GVRGPAGEKG DRGPPGLDGR SGLDGKPGAA GPSGPNGAAG
1801 KAGDPGRDGL PGLRGEQGLP GPSGPPGLPG KPGEDGKPGL NGKNGEPGDP
1851 GEDGRKGEKG DSGASGREGR DGPKGERGAP GILGPQGPPG LPGPVGPPGQ
1901 GFPGVPGGTG PKGDRGETGS KGEQGLPGER GLRGEPGSVP NVDRLLETAG
1951 IKASALREIV ETWDESSGSF LPVPERRRGP KGDSGEQGPP GKEGPIGFPG
2001 ERGLKGDRGD PGPQGPPGLA LGERGPPGPS GLAGEPGKPG IPGLPGRAGG
2051 VGEAGRPGER GERGEKGERG EQGRDGPPGL PGTPGPPGPP GPKVSVDEPG
2101 PGLSGEQGPP GLKGAKGEPG SNGDQGPKGD RGVPGIKGDR GEPGPRGQDG
2151 NPGLPGERGM AGPEGKPGLQ GPRGPPGPVG GHGDPGPPGA PGLAGPAGPQ
2201 GPSGLKGEPG ETGPPGRGLT GPTGAVGLPG PPGPSGLVGP QGSPGLPGQV
2251 GETGKPGAPG RDGASGKDGD RGSPGVPGSP GLPGPVGPKG EPGPTGAPGQ
2301 AVVGLPGAKG EKGAPGGLAG DLVGEPGAKG DRGLPGPRGE KGEAGRAGEP
2351 GDPGEDGQKG APGPKGFKGD PGVGVPGSPG PPGPPGVKGD LGLPGLPGAP
2401 GVVGFPGQTG PRGEMGQPGP SGERGLAGPP GREGIPGPLG PPGPPGSVGP
2451 PGASGLKGDK GDPGVGLPGP RGERGEPGIR GEDGRPGQEG PRGLTGPPGS
2501 RGERGEKGDV GSAGLKGDKG DSAVILGPPG PRGAKGDMGE RGPRGLDGDK
2551 GPRGDNGDPG DKGSKGEPGD KGSAGLPGLR GLLGPQGQPG AAGIPGDPGS
2601 PGKDGVPGIR GEKGDVGFMG PRGLKGERGV KGACGLDGEK GDKGEAGPPG
2651 RPGLAGHKGE MGEPGVPGQS GAPGKEGLIG PKGDRGFDGQ PGPKGDQGEK
2701 GERGTPGIGG FPGPSGNDGS AGPPGPPGSV GPRGPEGLQG QKGERGPPGE
2751 RVVGAPGVPG APGERGEQGR PGPAGPRGEK GEAALTEDDI RGFVRQEMSQ
2801 HCACQGQFIA SGSRPLPSYA ADTAGSQLHA VPVLRVSHAE EEERVPPEDD
2851 EYSEYSEYSV EEYQDPEAPW DSDDPCSLPL DEGSCTAYTL RWYHRAVTGS
2901 TEACHPFVYG GCGGNANRFG TREACERRCP PRVVQSQGTG TAQD
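
The sequence above is displayed in blocks of ten residues with position numbers. To recover the plain, unformatted sequence, the numbers and whitespace can be stripped. The following is a minimal sketch under that assumption: `FORMATTED` holds only the first 100 residues as sample input, and `unformat` is a hypothetical helper name. It works regardless of where the position numbers sit, because it keeps letters only.

```python
import re

# First 100 residues of the entry, copied with their position numbers.
FORMATTED = """\
1 MTLRLLVAAL CAGILAEAPR VRAQHRERVT CTRLYAADIV FLLDGSSSIG
51 RSNFREVRSF LEGLVLPFSG AASAQGVRFA TVQYSDDPRT EFGLDALGSG
"""

def unformat(text):
    """Drop position numbers and whitespace, keeping only residue letters."""
    return re.sub(r"[^A-Za-z]", "", text).upper()

seq = unformat(FORMATTED)
print(len(seq))  # number of residues in the sample
```

Applied to the whole sequence block, the resulting string should have the length reported in the summary (2944 amino acids), which makes a convenient sanity check after copying.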
 


Checksums:
CRC64:96D8BF6D0FD387DB
MD5:5fee08b99008bd3ad21b01cd193d49a7
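
A checksum such as the MD5 value above lets you confirm that a local copy of the sequence matches the stored one. The following is a minimal sketch of computing an MD5 digest with Python's standard `hashlib`. It assumes, without confirmation from this page, that the digest is taken over the uppercase residue letters with all whitespace removed; `sequence_md5` is a hypothetical helper name. CRC64 is not covered here because it needs a specific polynomial that this page does not state.

```python
import hashlib

def sequence_md5(seq):
    """MD5 hex digest of a sequence after removing whitespace and uppercasing.

    The normalisation step is an assumption; adjust it if the stored
    checksum was computed over a different representation.
    """
    cleaned = "".join(seq.split()).upper()
    return hashlib.md5(cleaned.encode("ascii")).hexdigest()

# Formatting differences do not change the digest under this normalisation.
same = sequence_md5("mtlrllvaal cagilaeapr") == sequence_md5("MTLRLLVAALCAGILAEAPR")
```

To verify a copy, compare `sequence_md5(full_sequence)` with the MD5 value listed above; a mismatch means either the sequence differs or the assumed normalisation does not match how the checksum was produced.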

TreeFam

Below is a phylogenetic tree of animal genes, with ortholog and paralog assignments, from TreeFam.