
Protein: CO2A1_HUMAN (P02458)

Summary

This is the summary of UniProt entry CO2A1_HUMAN (P02458).

Description: Collagen alpha-1(II) chain {ECO:0000305}
Source organism: Homo sapiens (Human) (NCBI taxonomy ID 9606)
Length: 1487 amino acids
Reference Proteome: ✓

Please note: at the start of each new Pfam data release, we take a copy of the UniProt sequence database. This snapshot of UniProt forms the basis of the overview shown here. Although some UniProt entries may be removed after a Pfam release, they are not removed from Pfam until the next Pfam data release.

Pfam domains


Source Domain Start End
sig_p n/a 1 25
low_complexity n/a 9 21
Pfam VWC 34 89
disorder n/a 96 1286
low_complexity n/a 98 110
Pfam Collagen 116 182
low_complexity n/a 117 135
low_complexity n/a 153 177
low_complexity n/a 197 241
Pfam Collagen 199 260
Pfam Collagen 237 317
low_complexity n/a 237 271
low_complexity n/a 309 325
low_complexity n/a 330 348
low_complexity n/a 345 382
low_complexity n/a 390 412
low_complexity n/a 416 451
Pfam Collagen 429 498
low_complexity n/a 464 510
low_complexity n/a 579 600
low_complexity n/a 621 633
low_complexity n/a 636 661
low_complexity n/a 659 687
low_complexity n/a 696 711
low_complexity n/a 729 753
low_complexity n/a 783 804
Pfam Collagen 792 860
low_complexity n/a 804 834
low_complexity n/a 825 849
low_complexity n/a 854 879
low_complexity n/a 870 885
low_complexity n/a 888 903
low_complexity n/a 906 930
low_complexity n/a 948 966
low_complexity n/a 1017 1032
low_complexity n/a 1038 1053
low_complexity n/a 1062 1081
low_complexity n/a 1137 1176
Pfam Collagen 1158 1217
low_complexity n/a 1179 1192
low_complexity n/a 1197 1217
Pfam COLFI 1251 1486
low_complexity n/a 1333 1345
disorder n/a 1335 1336
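
Several rows in the table overlap (for example, the Collagen matches at 199-260 and 237-317). As a minimal sketch, assuming the Start and End columns are 1-based, inclusive residue positions, the following merges overlapping matches and counts the residues they cover. The helper name `merged_coverage` is hypothetical, and the intervals are copied from the six Pfam Collagen rows above.

```python
def merged_coverage(intervals):
    """Merge overlapping closed intervals; return (merged, residues_covered)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it to the larger end.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    # Closed intervals, so each one covers end - start + 1 residues.
    covered = sum(end - start + 1 for start, end in merged)
    return [tuple(m) for m in merged], covered


# Start/End pairs of the Pfam Collagen rows in the table above.
collagen = [(116, 182), (199, 260), (237, 317),
            (429, 498), (792, 860), (1158, 1217)]

merged, covered = merged_coverage(collagen)
print(merged)
print(covered)
```

The same function can be applied to the low_complexity or disorder rows by swapping in their Start and End values.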


Sequence information

This is the amino acid sequence of the UniProt entry with accession P02458. Pfam stores its own copy of the sequence and updates it only with each new Pfam release, so the sequence stored here may differ from the one currently held by UniProt.

Sequence:
1 MIRLGAPQTL VLLTLLVAAV LRCQGQDVQE AGSCVQDGQR YNDKDVWKPE 50
51 PCRICVCDTG TVLCDDIICE DVKDCLSPEI PFGECCPICP TDLATASGQP 100
101 GPKGQKGEPG DIKDIVGPKG PPGPQGPAGE QGPRGDRGDK GEKGAPGPRG 150
151 RDGEPGTPGN PGPPGPPGPP GPPGLGGNFA AQMAGGFDEK AGGAQLGVMQ 200
201 GPMGPMGPRG PPGPAGAPGP QGFQGNPGEP GEPGVSGPMG PRGPPGPPGK 250
251 PGDDGEAGKP GKAGERGPPG PQGARGFPGT PGLPGVKGHR GYPGLDGAKG 300
301 EAGAPGVKGE SGSPGENGSP GPMGPRGLPG ERGRTGPAGA AGARGNDGQP 350
351 GPAGPPGPVG PAGGPGFPGA PGAKGEAGPT GARGPEGAQG PRGEPGTPGS 400
401 PGPAGASGNP GTDGIPGAKG SAGAPGIAGA PGFPGPRGPP GPQGATGPLG 450
451 PKGQTGEPGI AGFKGEQGPK GEPGPAGPQG APGPAGEEGK RGARGEPGGV 500
501 GPIGPPGERG APGNRGFPGQ DGLAGPKGAP GERGPSGLAG PKGANGDPGR 550
551 PGEPGLPGAR GLTGRPGDAG PQGKVGPSGA PGEDGRPGPP GPQGARGQPG 600
601 VMGFPGPKGA NGEPGKAGEK GLPGAPGLRG LPGKDGETGA AGPPGPAGPA 650
651 GERGEQGAPG PSGFQGLPGP PGPPGEGGKP GDQGVPGEAG APGLVGPRGE 700
701 RGFPGERGSP GAQGLQGPRG LPGTPGTDGP KGASGPAGPP GAQGPPGLQG 750
751 MPGERGAAGI AGPKGDRGDV GEKGPEGAPG KDGGRGLTGP IGPPGPAGAN 800
801 GEKGEVGPPG PAGSAGARGA PGERGETGPP GPAGFAGPPG ADGQPGAKGE 850
851 QGEAGQKGDA GAPGPQGPSG APGPQGPTGV TGPKGARGAQ GPPGATGFPG 900
901 AAGRVGPPGS NGNPGPPGPP GPSGKDGPKG ARGDSGPPGR AGEPGLQGPA 950
951 GPPGEKGEPG DDGPSGAEGP PGPQGLAGQR GIVGLPGQRG ERGFPGLPGP 1000
1001 SGEPGKQGAP GASGDRGPPG PVGPPGLTGP AGEPGREGSP GADGPPGRDG 1050
1051 AAGVKGDRGE TGAVGAPGAP GPPGSPGPAG PTGKQGDRGE AGAQGPMGPS 1100
1101 GPAGARGIQG PQGPRGDKGE AGEPGERGLK GHRGFTGLQG LPGPPGPSGD 1150
1151 QGASGPAGPS GPRGPPGPVG PSGKDGANGI PGPIGPPGPR GRSGETGPAG 1200
1201 PPGNPGPPGP PGPPGPGIDM SAFAGLGPRE KGPDPLQYMR ADQAAGGLRQ 1250
1251 HDAEVDATLK SLNNQIESIR SPEGSRKNPA RTCRDLKLCH PEWKSGDYWI 1300
1301 DPNQGCTLDA MKVFCNMETG ETCVYPNPAN VPKKNWWSSK SKEKKHIWFG 1350
1351 ETINGGFHFS YGDDNLAPNT ANVQMTFLRL LSTEGSQNIT YHCKNSIAYL 1400
1401 DEAAGNLKKA LLIQGSNDVE IRAEGNSRFT YTALKDGCTK HTGKWGKTVI 1450
1451 EYRSQKTSRL PIIDIAPMDI GGPEQEFGVD IGPVCFL 1487


Checksums:
CRC64:A8312503825BF0BB
MD5:79890c4bfb30697f52dbe88409cce3e0
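
Because the stored sequence may differ from the current UniProt one, a checksum comparison is a quick way to tell whether two copies are identical. The sketch below strips position markers and whitespace from a formatted block and hashes the result with MD5. It assumes the checksum is taken over the uppercase one-letter residue string with no whitespace, which this page does not confirm; the function names `unformatted` and `sequence_md5` are hypothetical.

```python
import hashlib


def unformatted(block: str) -> str:
    """Keep only residue letters, dropping position markers and whitespace."""
    return "".join(ch for ch in block if ch.isalpha()).upper()


def sequence_md5(block: str) -> str:
    """Hex MD5 digest of the cleaned sequence (assumed checksum input)."""
    return hashlib.md5(unformatted(block).encode("ascii")).hexdigest()


# First 20 residues of the sequence above, with position markers left in.
example = "1\nMIRLGAPQTL VLLTLLVAAV\n20"
print(unformatted(example))
print(len(unformatted(example)))
print(sequence_md5(example))
```

Applied to the full sequence block, the length of the cleaned string can be checked against the stated 1487 amino acids, and the digest against the MD5 value listed above.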

Structures

For sequences that have a structure in the Protein Data Bank, we use the mapping between the UniProt, PDB and Pfam coordinate systems from the PDBe SIFTS project to map Pfam domains onto three-dimensional structures. The table below shows the mapping between Pfam domains, this UniProt entry and the corresponding three-dimensional structures.

Pfam family UniProt residues PDB ID PDB chain ID PDB residues
Collagen 461 - 473 6NIX C 1 - 13
Collagen 461 - 474 2FSE E 999 - 1012
Collagen 461 - 474 2FSE F 1999 - 2012
Collagen 486 - 495 5OCX A 4 - 13
Collagen 821 - 832 5MV4 E 21 - 32
Collagen 821 - 832 5MV4 N 21 - 32
Collagen 821 - 832 5MV4 Q 21 - 32
Collagen 821 - 832 5MV4 T 21 - 32
Collagen 821 - 832 5MV4 W 21 - 32
Collagen 821 - 832 5MV4 X 21 - 32
Collagen 821 - 833 5MV4 H 21 - 33
Collagen 821 - 833 5MV4 K 21 - 33
VWC 34 - 89 1U5M A 10 - 65
VWC 34 - 89 5NIR A 34 - 89
VWC 34 - 89 5NIR B 34 - 89
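
Each row pairs a UniProt residue range with a PDB residue range of the same length, so converting between the two numberings within a row is a constant offset. The sketch below illustrates this with values from the table. It assumes numbering inside each segment is contiguous (no gaps or insertion codes), which the table does not state; `Segment` and `to_pdb_residue` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    pdb_id: str
    chain: str
    uniprot_start: int
    uniprot_end: int
    pdb_start: int
    pdb_end: int


def to_pdb_residue(seg: Segment, uniprot_pos: int) -> Optional[int]:
    """Map a UniProt position to PDB numbering; None if outside the segment."""
    if not seg.uniprot_start <= uniprot_pos <= seg.uniprot_end:
        return None
    # Assumes contiguous numbering, so the offset is constant in the segment.
    return seg.pdb_start + (uniprot_pos - seg.uniprot_start)


# Two rows copied from the table above.
vwc_1u5m = Segment("1U5M", "A", 34, 89, 10, 65)
collagen_2fse = Segment("2FSE", "E", 461, 474, 999, 1012)

print(to_pdb_residue(vwc_1u5m, 34))
print(to_pdb_residue(collagen_2fse, 470))
print(to_pdb_residue(vwc_1u5m, 33))
```

For structures with gaps or insertion codes, a per-residue mapping would be needed instead of a single offset.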

TreeFam

Below is a phylogenetic tree of animal genes, with ortholog and paralog assignments, from TreeFam.