Protein: COCA1_HUMAN (Q99715)

Summary

This is the summary of UniProt entry COCA1_HUMAN (Q99715).

Description: Collagen alpha-1(XII) chain
Source organism: Homo sapiens (Human) (NCBI taxonomy ID 9606)
Length: 3063 amino acids
Reference Proteome: ✓

Please note: at the start of each new Pfam data release, we take a copy of the UniProt sequence database, and this snapshot forms the basis of the overview shown here. UniProt entries that are removed after a Pfam release therefore remain in Pfam until the next Pfam data release.

Pfam domains

Source Domain Start End
sig_p n/a 1 23
low_complexity n/a 4 17
Pfam fn3 26 105
disorder n/a 57 58
disorder n/a 66 72
low_complexity n/a 68 88
disorder n/a 117 126
Pfam VWA 140 311
low_complexity n/a 323 337
Pfam fn3 335 416
Pfam VWA 440 611
Pfam fn3 633 712
disorder n/a 712 731
Pfam fn3 724 801
disorder n/a 762 781
disorder n/a 799 829
Pfam fn3 815 894
disorder n/a 831 833
disorder n/a 855 869
disorder n/a 897 913
Pfam fn3 906 986
disorder n/a 953 961
disorder n/a 984 1016
Pfam fn3 999 1077
disorder n/a 1027 1028
disorder n/a 1037 1049
disorder n/a 1051 1054
disorder n/a 1079 1095
Pfam fn3 1088 1168
disorder n/a 1098 1105
disorder n/a 1116 1119
disorder n/a 1169 1170
disorder n/a 1172 1174
Pfam VWA 1199 1370
disorder n/a 1316 1319
Pfam fn3 1386 1465
disorder n/a 1387 1389
disorder n/a 1468 1470
Pfam fn3 1475 1556
disorder n/a 1508 1509
disorder n/a 1514 1529
disorder n/a 1558 1565
Pfam fn3 1567 1646
disorder n/a 1568 1573
disorder n/a 1607 1615
disorder n/a 1636 1663
Pfam fn3 1656 1734
disorder n/a 1666 1668
disorder n/a 1703 1707
disorder n/a 1731 1761
Pfam fn3 1754 1835
disorder n/a 1782 1809
disorder n/a 1812 1813
disorder n/a 1820 1852
Pfam fn3 1845 1926
disorder n/a 1858 1865
disorder n/a 1875 1879
disorder n/a 1887 1891
disorder n/a 1893 1897
low_complexity n/a 1906 1917
disorder n/a 1911 1940
low_complexity n/a 1917 1930
Pfam fn3 1937 2016
disorder n/a 1947 1955
disorder n/a 1965 1968
disorder n/a 1975 1983
disorder n/a 2010 2032
Pfam fn3 2026 2108
disorder n/a 2068 2071
disorder n/a 2075 2081
disorder n/a 2108 2112
Pfam fn3 2117 2195
Pfam fn3 2205 2283
disorder n/a 2235 2238
disorder n/a 2241 2250
disorder n/a 2253 2256
disorder n/a 2283 2314
low_complexity n/a 2293 2312
Pfam VWA 2323 2495
disorder n/a 2725 2727
disorder n/a 2732 3063
Pfam Collagen 2745 2802
low_complexity n/a 2746 2767
low_complexity n/a 2770 2797
Pfam Collagen 2800 2853
low_complexity n/a 2816 2841
Pfam Collagen 2844 2902
low_complexity n/a 2850 2877
Pfam Collagen 2937 2994
low_complexity n/a 2940 2971
low_complexity n/a 2989 3008
low_complexity n/a 3008 3041
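
The rows above can be read programmatically. The following is a minimal Python sketch, assuming the table has been saved as whitespace-separated text in the same column order, that coordinates are 1-based, and that the End column is inclusive (the page does not state this explicitly). The names `Region`, `parse_regions` and `pfam_domain_counts` are hypothetical and not part of any Pfam tool.

```python
from collections import Counter
from typing import NamedTuple


class Region(NamedTuple):
    source: str   # e.g. "Pfam", "disorder", "low_complexity", "sig_p"
    domain: str   # Pfam family name, or "n/a" for non-Pfam rows
    start: int    # first residue, 1-based
    end: int      # last residue, assumed inclusive

    @property
    def length(self) -> int:
        # Number of residues covered, under the inclusive-end assumption.
        return self.end - self.start + 1


def parse_regions(text: str) -> list[Region]:
    """Parse 'Source Domain Start End' rows, skipping the header and blanks."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        # Keep only rows with four fields whose third field is a number.
        if len(fields) != 4 or not fields[2].isdigit():
            continue
        source, domain, start, end = fields
        regions.append(Region(source, domain, int(start), int(end)))
    return regions


def pfam_domain_counts(regions: list[Region]) -> Counter:
    """Count how often each Pfam family occurs, ignoring non-Pfam rows."""
    return Counter(r.domain for r in regions if r.source == "Pfam")


# A few rows copied from the table above, for illustration only.
SAMPLE = """\
Source Domain Start End
sig_p n/a 1 23
Pfam fn3 26 105
Pfam VWA 140 311
Pfam fn3 335 416
Pfam Collagen 2745 2802
"""

regions = parse_regions(SAMPLE)
counts = pfam_domain_counts(regions)
```

Filtering on the Source column before counting keeps the `n/a` placeholder of the disorder, low_complexity and sig_p rows out of the family tally.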

Sequence information

This is the amino acid sequence of the UniProt sequence database entry with the accession Q99715. Pfam stores its own copy of this sequence and updates it only with each new Pfam release, so the sequence shown here may differ from the one currently stored by UniProt.

Sequence:
1
MRSRLPPALA ALGAALLLSS IEAEVDPPSD LNFKIIDENT VHMSWAKPVD
50
51
PIVGYRITVD PTTDGPTKEF TLSASTTETL LSELVPETEY VVTITSYDEV
100
101
EESVPVIGQL TIQTGSSTKP VEKKPGKTEI QKCSVSAWTD LVFLVDGSWS
150
151
VGRNNFKYIL DFIAALVSAF DIGEEKTRVG VVQYSSDTRT EFNLNQYYQR
200
201
DELLAAIKKI PYKGGNTMTG DAIDYLVKNT FTESAGARVG FPKVAIIITD
250
251
GKSQDEVEIP ARELRNVGVE VFSLGIKAAD AKELKQIAST PSLNHVFNVA
300
301
NFDAIVDIQN EIISQVCSGV DEQLGELVSG EEVVEPPSNL IAMEVSSKYV
350
351
KLNWNPSPSP VTGYKVILTP MTAGSRQHAL SVGPQTTTLS VRDLSADTEY
400
401
QISVSAMKGM TSSEPISIME KTQPMKVQVE CSRGVDIKAD IVFLVDGSYS
450
451
IGIANFVKVR AFLEVLVKSF EISPNRVQIS LVQYSRDPHT EFTLKKFTKV
500
501
EDIIEAINTF PYRGGSTNTG KAMTYVREKI FVPSKGSRSN VPKVMILITD
550
551
GKSSDAFRDP AIKLRNSDVE IFAVGVKDAV RSELEAIASP PAETHVFTVE
600
601
DFDAFQRISF ELTQSICLRI EQELAAIKKK AYVPPKDLSF SEVTSYGFKT
650
651
NWSPAGENVF SYHITYKEAA GDDEVTVVEP ASSTSVVLSS LKPETLYLVN
700
701
VTAEYEDGFS IPLAGEETTE EVKGAPRNLK VTDETTDSFK ITWTQAPGRV
750
751
LRYRIIYRPV AGGESREVTT PPNQRRRTLE NLIPDTKYEV SVIPEYFSGP
800
801
GTPLTGNAAT EEVRGNPRDL RVSDPTTSTM KLSWSGAPGK VKQYLVTYTP
850
851
VAGGETQEVT VRGDTTNTVL QGLKEGTQYA LSVTALYASG AGDALFGEGT
900
901
TLEERGSPQD LVTKDITDTS IGAYWTSAPG MVRGYRVSWK SLYDDVDTGE
950
951
KNLPEDAIHT MIENLQPETK YRISVFATYS SGEGEPLTGD ATTELSQDSK
1000
1001
TLKVDEETEN TMRVTWKPAP GKVVNYRVVY RPHGRGKQMV AKVPPTVTST
1050
1051
VLKRLQPQTT YDITVLPIYK MGEGKLRQGS GTTASRFKSP RNLKTSDPTM
1100
1101
SSFRVTWEPA PGEVKGYKVT FHPTGDDRRL GELVVGPYDN TVVLEELRAG
1150
1151
TTYKVNVFGM FDGGESSPLV GQEMTTLSDT TVMPILSSGM ECLTRAEADI
1200
1201
VLLVDGSWSI GRANFRTVRS FISRIVEVFD IGPKRVQIAL AQYSGDPRTE
1250
1251
WQLNAHRDKK SLLQAVANLP YKGGNTLTGM ALNFIRQQNF RTQAGMRPRA
1300
1301
RKIGVLITDG KSQDDVEAPS KKLKDEGVEL FAIGIKNADE VELKMIATDP
1350
1351
DDTHAYNVAD FESLSRIVDD LTINLCNSVK GPGDLEAPSN LVISERTHRS
1400
1401
FRVSWTPPSD SVDRYKVEYY PVSGGKRQEF YVSRMETSTV LKDLKPETEY
1450
1451
VVNVYSVVED EYSEPLKGTE KTLPVPVVSL NIYDVGPTTM HVQWQPVGGA
1500
1501
TGYILSYKPV KDTEPTRPKE VRLGPTVNDM QLTDLVPNTE YAVTVQAVLH
1550
1551
DLTSEPVTVR EVTLPLPRPQ DLKLRDVTHS TMNVFWEPVP GKVRKYIVRY
1600
1601
KTPEEDVKEV EVDRSETSTS LKDLFSQTLY TVSVSAVHDE GESPPVTAQE
1650
1651
TTRPVPAPTN LKITEVTSEG FRGTWDHGAS DVSLYRITWA PFGSSDKMET
1700
1701
ILNGDENTLV FENLNPNTIY EVSITAIYPD ESESDDLIGS ERTLPILTTQ
1750
1751
APKSGPRNLQ VYNATSNSLT VKWDPASGRV QKYRITYQPS TGEGNEQTTT
1800
1801
IGGRQNSVVL QKLKPDTPYT ITVSSLYPDG EGGRMTGRGK TKPLNTVRNL
1850
1851
RVYDPSTSTL NVRWDHAEGN PRQYKLFYAP AAGGPEELVP IPGNTNYAIL
1900
1901
RNLQPDTSYT VTVVPVYTEG DGGRTSDTGR TLMRGLARNV QVYNPTPNSL
1950
1951
DVRWDPAPGP VLQYRVVYSP VDGTRPSESI VVPGNTRMVH LERLIPDTLY
2000
2001
SVNLVALYSD GEGNPSPAQG RTLPRSGPRN LRVFGETTNS LSVAWDHADG
2050
2051
PVQQYRIIYS PTVGDPIDEY TTVPGRRNNV ILQPLQPDTP YKITVIAVYE
2100
2101
DGDGGHLTGN GRTVGLLPPQ NIHISDEWYT RFRVSWDPSP SPVLGYKIVY
2150
2151
KPVGSNEPME AFVGEMTSYT LHNLNPSTTY DVNVYAQYDS GLSVPLTDQG
2200
2201
TTLYLNVTDL KTYQIGWDTF CVKWSPHRAA TSYRLKLSPA DGTRGQEITV
2250
2251
RGSETSHCFT GLSPDTDYGV TVFVQTPNLE GPGVSVKEHT TVKPTEAPTE
2300
2301
PPTPPPPPTI PPARDVCKGA KADIVFLTDA SWSIGDDNFN KVVKFIFNTV
2350
2351
GGFDEISPAG IQVSFVQYSD EVKSEFKLNT YNDKALALGA LQNIRYRGGN
2400
2401
TRTGKALTFI KEKVLTWESG MRKNVPKVLV VVTDGRSQDE VKKAALVIQQ
2450
2451
SGFSVFVVGV ADVDYNELAN IASKPSERHV FIVDDFESFE KIEDNLITFV
2500
2501
CETATSSCPL IYLDGYTSPG FKMLEAYNLT EKNFASVQGV SLESGSFPSY
2550
2551
SAYRIQKNAF VNQPTADLHP NGLPPSYTII LLFRLLPETP SDPFAIWQIT
2600
2601
DRDYKPQVGV IADPSSKTLS FFNKDTRGEV QTVTFDTEEV KTLFYGSFHK
2650
2651
VHIVVTSKSV KIYIDCYEII EKDIKEAGNI TTDGYEILGK LLKGERKSAA
2700
2701
FQIQSFDIVC SPVWTSRDRC CDIPSRRDEG KCPAFPNSCT CTQDSVGPPG
2750
2751
PPGPAGGPGA KGPRGERGIS GAIGPPGPRG DIGPPGPQGP PGPQGPNGLS
2800
2801
IPGEQGRQGM KGDAGEPGLP GRTGTPGLPG PPGPMGPPGD RGFTGKDGAM
2850
2851
GPRGPPGPPG SPGSPGVTGP SGKPGKPGDH GRPGPSGLKG EKGDRGDIAS
2900
2901
QNMMRAVARQ VCEQLISGQM NRFNQMLNQI PNDYQSSRNQ PGPPGPPGPP
2950
2951
GSAGARGEPG PGGRPGFPGT PGMQGPPGER GLPGEKGERG TGSSGPRGLP
3000
3001
GPPGPQGESR TGPPGSTGSR GPPGPPGRPG NSGIRGPPGP PGYCDSSQCA
3050
3051
SIPYNGQGYP GSG                                        
3063
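
The listing above interleaves residue lines (groups of ten letters separated by spaces) with position markers on lines of their own. The following is a minimal Python sketch for recovering the plain sequence from such a listing, assuming marker lines contain digits only and residue lines contain only letters and spaces. The function name `join_sequence_blocks` is hypothetical.

```python
import re


def join_sequence_blocks(text: str) -> str:
    """Drop position-marker lines and spaces, returning one residue string."""
    residues = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and lines holding only a position marker.
        if not stripped or stripped.isdigit():
            continue
        # Remove the spaces between the ten-residue groups.
        residues.append(re.sub(r"\s+", "", stripped))
    return "".join(residues)


# The first two blocks of the listing above, for illustration only.
SAMPLE = """\
1
MRSRLPPALA ALGAALLLSS IEAEVDPPSD LNFKIIDENT VHMSWAKPVD
50
51
PIVGYRITVD PTTDGPTKEF TLSASTTETL LSELVPETEY VVTITSYDEV
100
"""

sequence = join_sequence_blocks(SAMPLE)
```

Applied to the complete listing, the length of the result can be compared with the 3063 amino acids stated in the summary as a check that no line was lost.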
 

Checksums:
CRC64:EA38CAFECE8393D2
MD5:61b7485266c808990fc2865f2a9141d8
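
A checksum like the MD5 value above can be used to test whether a locally held copy of the sequence matches the stored one. The following is a minimal Python sketch, assuming the MD5 is computed over the uppercase one-letter sequence with all whitespace removed; the page does not state the exact input, so a mismatch may reflect that assumption rather than a different sequence. The function name `sequence_md5` is hypothetical, and the CRC64 value is not reproduced here.

```python
import hashlib


def sequence_md5(sequence: str) -> str:
    """Return the lowercase hex MD5 of a sequence, ignoring whitespace and case."""
    # Assumption: the checksum covers uppercase residues with no whitespace.
    cleaned = "".join(sequence.split()).upper()
    return hashlib.md5(cleaned.encode("ascii")).hexdigest()


# Formatting differences do not change the digest under this normalisation.
digest_formatted = sequence_md5("MRSRLPPALA ALGAALLLSS\n")
digest_plain = sequence_md5("mrsrlppalaalgaalllss")
```

Comparing `sequence_md5(full_sequence)` with the MD5 line above would then show whether the two copies agree under the stated assumption.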

TreeFam

Below is a phylogenetic tree of animal genes, with ortholog and paralog assignments, from TreeFam.