
Protein: CO1A1_MOUSE (P11087)

Summary

This is the summary of UniProt entry CO1A1_MOUSE (P11087).

Description: Collagen alpha-1(I) chain
Source organism: Mus musculus (Mouse) (NCBI taxonomy ID 10090)
Length: 1453 amino acids
Reference Proteome: ✓

Please note: when we start each new Pfam data release, we take a copy of the UniProt sequence database, and this snapshot forms the basis of the overview shown here. UniProt entries that are removed after a Pfam release therefore remain in Pfam until the next Pfam data release.

Pfam domains


Source Domain Start End
sig_p n/a 1 22
low_complexity n/a 6 19
Pfam VWC 31 86
disorder n/a 94 1248
Pfam Collagen 97 154
low_complexity n/a 101 125
low_complexity n/a 124 146
Pfam Collagen 166 227
low_complexity n/a 166 215
low_complexity n/a 221 245
Pfam Collagen 225 284
low_complexity n/a 260 284
Pfam Collagen 285 344
low_complexity n/a 290 311
low_complexity n/a 314 339
low_complexity n/a 353 384
low_complexity n/a 380 408
low_complexity n/a 400 425
low_complexity n/a 446 465
low_complexity n/a 461 485
low_complexity n/a 494 515
low_complexity n/a 526 548
low_complexity n/a 542 566
low_complexity n/a 589 599
low_complexity n/a 602 626
low_complexity n/a 628 650
low_complexity n/a 677 698
low_complexity n/a 701 716
low_complexity n/a 752 774
Pfam Collagen 768 827
low_complexity n/a 770 791
low_complexity n/a 782 815
low_complexity n/a 824 849
Pfam Collagen 828 887
low_complexity n/a 845 869
low_complexity n/a 863 878
low_complexity n/a 872 905
low_complexity n/a 902 927
low_complexity n/a 923 941
low_complexity n/a 974 1001
low_complexity n/a 994 1014
Pfam Collagen 1008 1077
low_complexity n/a 1028 1049
low_complexity n/a 1058 1082
Pfam Collagen 1068 1127
low_complexity n/a 1100 1139
Pfam Collagen 1122 1184
low_complexity n/a 1141 1164
low_complexity n/a 1163 1183
low_complexity n/a 1205 1216
Pfam COLFI 1216 1452
disorder n/a 1299 1302
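
The region table above can be read programmatically. The following is a minimal sketch, assuming each row holds exactly four whitespace-separated fields (source, domain, start, end) and that "n/a" marks a region with no named domain. The names `Region` and `parse_regions` are hypothetical and are not part of any Pfam tool.

```python
from typing import List, NamedTuple, Optional


class Region(NamedTuple):
    """One row of the region table (hypothetical record type)."""
    source: str             # e.g. "Pfam", "sig_p", "low_complexity", "disorder"
    domain: Optional[str]   # domain name, or None where the table says "n/a"
    start: int              # first residue of the region (1-based)
    end: int                # last residue of the region (inclusive)


def parse_regions(text: str) -> List[Region]:
    """Parse 'Source Domain Start End' rows, skipping the header and blank lines."""
    regions = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != 4 or fields[0] == "Source":
            continue
        source, domain, start, end = fields
        regions.append(
            Region(source, None if domain == "n/a" else domain, int(start), int(end))
        )
    return regions


# A few rows copied from the table above.
sample = """\
Source Domain Start End
sig_p n/a 1 22
Pfam VWC 31 86
Pfam Collagen 97 154
Pfam COLFI 1216 1452
"""

regions = parse_regions(sample)
pfam_only = [r for r in regions if r.source == "Pfam"]
for r in pfam_only:
    # Region length in residues, counting both endpoints.
    print(r.domain, r.start, r.end, r.end - r.start + 1)
```

Filtering on the source column separates the Pfam domain matches from the signal peptide, low-complexity and disorder annotations that share the same table.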


Sequence information

This is the amino acid sequence of the UniProt sequence database entry with the accession P11087. The sequence is stored in the Pfam database and updated only with each new Pfam release, so it may differ from the sequence currently stored by UniProt.

Sequence:
   1 MFSFVDLRLL LLLGATALLT HGQEDIPEVS CIHNGLRVPN GETWKPEVCL 50
  51 ICICHNGTAV CDDVQCNEEL DCPNPQRREG ECCAFCPEEY VSPNSEDVGV 100
 101 EGPKGDPGPQ GPRGPVGPPG RDGIPGQPGL PGPPGPPGPP GPPGLGGNFA 150
 151 SQMSYGYDEK SAGVSVPGPM GPSGPRGLPG PPGAPGPQGF QGPPGEPGEP 200
 201 GGSGPMGPRG PPGPPGKNGD DGEAGKPGRP GERGPPGPQG ARGLPGTAGL 250
 251 PGMKGHRGFS GLDGAKGDAG PAGPKGEPGS PGENGAPGQM GPRGLPGERG 300
 301 RPGPPGTAGA RGNDGAVGAA GPPGPTGPTG PPGFPGAVGA KGEAGPQGAR 350
 351 GSEGPQGVRG EPGPPGPAGA AGPAGNPGAD GQPGAKGANG APGIAGAPGF 400
 401 PGARGPSGPQ GPSGPPGPKG NSGEPGAPGN KGDTGAKGEP GATGVQGPPG 450
 451 PAGEEGKRGA RGEPGPSGLP GPPGERGGPG SRGFPGADGV AGPKGPSGER 500
 501 GAPGPAGPKG SPGEAGRPGE AGLPGAKGLT GSPGSPGPDG KTGPPGPAGQ 550
 551 DGRPGPAGPP GARGQAGVMG FPGPKGTAGE PGKAGERGLP GPPGAVGPAG 600
 601 KDGEAGAQGA PGPAGPAGER GEQGPAGSPG FQGLPGPAGP PGEAGKPGEQ 650
 651 GVPGDLGAPG PSGARGERGF PGERGVQGPP GPAGPRGNNG APGNDGAKGD 700
 701 TGAPGAPGSQ GAPGLQGMPG ERGAAGLPGP KGDRGDAGPK GADGSPGKDG 750
 751 ARGLTGPIGP PGPAGAPGDK GEAGPSGPPG PTGARGAPGD RGEAGPPGPA 800
 801 GFAGPPGADG QPGAKGEPGD TGVKGDAGPP GPAGPAGPPG PIGNVGAPGP 850
 851 KGPRGAAGPP GATGFPGAAG RVGPPGPSGN AGPPGPPGPV GKEGGKGPRG 900
 901 ETGPAGRPGE VGPPGPPGPA GEKGSPGADG PAGSPGTPGP QGIAGQRGVV 950
 951 GLPGQRGERG FPGLPGPSGE PGKQGPSGSS GERGPPGPMG PPGLAGPPGE 1000
1001 SGREGSPGAE GSPGRDGAPG AKGDRGETGP AGPPGAPGAP GAPGPVGPAG 1050
1051 KNGDRGETGP AGPAGPIGPA GARGPAGPQG PRGDKGETGE QGDRGIKGHR 1100
1101 GFSGLQGPPG SPGSPGEQGP SGASGPAGPR GPPGSAGSPG KDGLNGLPGP 1150
1151 IGPPGPRGRT GDSGPAGPPG PPGPPGPPGP PSGGYDFSFL PQPPQEKSQD 1200
1201 GGRYYRADDA NVVRDRDLEV DTTLKSLSQQ IENIRSPEGS RKNPARTCRD 1250
1251 LKMCHSDWKS GEYWIDPNQG CNLDAIKVYC NMETGQTCVF PTQPSVPQKN 1300
1301 WYISPNPKEK KHVWFGESMT DGFPFEYGSE GSDPADVAIQ LTFLRLMSTE 1350
1351 ASQNITYHCK NSVAYMDQQT GNLKKALLLQ GSNEIELRGE GNSRFTYSTL 1400
1401 VDGCTSHTGT WGKTVIEYKT TKTSRLPIID VAPLDIGAPD QEFGLDIGPA 1450
1451 CFV 1453
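
The position numbers and the spaces between blocks of ten residues are display aids and are not part of the sequence. As a minimal sketch, assuming the listing contains only residue letters, digits and whitespace, the plain sequence can be recovered as follows. The name `clean_sequence` is hypothetical.

```python
import re


def clean_sequence(block: str) -> str:
    """Drop position numbers and whitespace, leaving only residue letters."""
    return re.sub(r"[\d\s]+", "", block).upper()


# First row of the listing above, with its start and end positions.
excerpt = """\
1
MFSFVDLRLL LLLGATALLT HGQEDIPEVS CIHNGLRVPN GETWKPEVCL
50
"""

seq = clean_sequence(excerpt)
print(len(seq))
```

Applied to the full listing, the length of the result can be compared with the length given in the summary (1453 amino acids).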
 


Checksums:
CRC64:0B7F06BBB9A1D5EA
MD5:e61e0b6131f983f5cb71523cf87b7a6f
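
A checksum lets you confirm that a locally held copy of the sequence matches the one stored here. The following is a minimal sketch for the MD5 value only. It assumes the digest is computed over the uppercase one-letter sequence with no whitespace or position numbers, which this page does not state, so treat a mismatch as inconclusive. The name `sequence_md5` is hypothetical. The CRC64 value is not covered because it depends on a specific CRC-64 variant.

```python
import hashlib


def sequence_md5(sequence: str) -> str:
    """Hex MD5 digest of the uppercase sequence (assumed convention, see above)."""
    return hashlib.md5(sequence.upper().encode("ascii")).hexdigest()


# Usage: pass the plain sequence string and compare with the value listed above.
listed_md5 = "e61e0b6131f983f5cb71523cf87b7a6f"
example = sequence_md5("MFSFVDLRLL")  # first ten residues only, so this will not match
print(example == listed_md5)
```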

TreeFam

Below is a phylogenetic tree of animal genes, with ortholog and paralog assignments, from TreeFam.