Protein: CO6A3_HUMAN (P12111)

Summary

This is the summary of UniProt entry CO6A3_HUMAN (P12111).

Description: Collagen alpha-3(VI) chain
Source organism: Homo sapiens (Human) (NCBI taxonomy ID 9606)
Length: 3177 amino acids
Reference Proteome: ✓

Please note: when we start each new Pfam data release, we take a copy of the UniProt sequence database. This snapshot of UniProt forms the basis of the overview that you see here. It is important to note that, although some UniProt entries may be removed after a Pfam release, these entries will not be removed from Pfam until the next Pfam data release.

Pfam domains

Source Domain Start End
sig_p n/a 1 25
Pfam VWA 39 212
disorder n/a 227 232
Pfam VWA 242 414
Pfam VWA 445 616
Pfam VWA 639 811
low_complexity n/a 702 719
low_complexity n/a 804 815
Pfam VWA 837 1008
Pfam VWA 1029 1200
Pfam VWA 1233 1403
disorder n/a 1296 1300
Pfam VWA 1436 1608
disorder n/a 1611 1632
low_complexity n/a 1616 1629
Pfam VWA 1639 1811
low_complexity n/a 1801 1815
disorder n/a 1888 1889
disorder n/a 1962 1963
low_complexity n/a 2001 2016
Pfam Collagen 2036 2096
disorder n/a 2041 2375
low_complexity n/a 2043 2055
low_complexity n/a 2054 2076
low_complexity n/a 2103 2124
low_complexity n/a 2176 2200
low_complexity n/a 2200 2224
low_complexity n/a 2227 2239
low_complexity n/a 2242 2267
low_complexity n/a 2266 2293
low_complexity n/a 2299 2313
Pfam VWA 2402 2579
Pfam VWA 2619 2808
disorder n/a 2849 2988
low_complexity n/a 2855 2897
low_complexity n/a 2904 2929
low_complexity n/a 2927 2966
low_complexity n/a 2973 2986
disorder n/a 2990 2995
disorder n/a 3003 3006
disorder n/a 3072 3093
low_complexity n/a 3074 3093
Pfam Kunitz_BPTI 3111 3163
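
The rows above follow a simple four-column layout (source, domain, start, end), so they can be tallied programmatically. The following is a minimal sketch, assuming the rows are available as whitespace-separated text. The `Region` tuple and the `parse_rows` helper are hypothetical names introduced for illustration, and only a few rows of the table are reproduced.

```python
from collections import Counter
from typing import List, NamedTuple


class Region(NamedTuple):
    source: str  # e.g. "Pfam", "disorder", "low_complexity", "sig_p"
    domain: str  # Pfam family name, or "n/a" for non-Pfam regions
    start: int   # first residue of the region
    end: int     # last residue of the region


def parse_rows(text: str) -> List[Region]:
    """Parse 'Source Domain Start End' rows into Region tuples."""
    regions = []
    for line in text.strip().splitlines():
        source, domain, start, end = line.split()
        regions.append(Region(source, domain, int(start), int(end)))
    return regions


# A subset of the rows shown in the table above.
ROWS = """
sig_p n/a 1 25
Pfam VWA 39 212
disorder n/a 227 232
Pfam VWA 242 414
Pfam Collagen 2036 2096
Pfam Kunitz_BPTI 3111 3163
"""

regions = parse_rows(ROWS)

# Count how often each Pfam family occurs, ignoring non-Pfam regions.
pfam_counts = Counter(r.domain for r in regions if r.source == "Pfam")
print(pfam_counts)
```

Keeping the source column in each tuple makes it straightforward to separate Pfam matches from the disorder, low-complexity and signal-peptide annotations.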

Sequence information

This is the amino acid sequence of the UniProt sequence database entry with the accession P12111. The sequence is stored in the Pfam database and is updated only with each new Pfam release, so the copy we store may differ from the one currently held by UniProt.

Sequence:
1 MRKHRHLPLV AVFCLFLSGF PTTHAQQQQA DVKNGAAADI IFLVDSSWTI 50
51 GEEHFQLVRE FLYDVVKSLA VGENDFHFAL VQFNGNPHTE FLLNTYRTKQ 100
101 EVLSHISNMS YIGGTNQTGK GLEYIMQSHL TKAAGSRAGD GVPQVIVVLT 150
151 DGHSKDGLAL PSAELKSADV NVFAIGVEDA DEGALKEIAS EPLNMHMFNL 200
201 ENFTSLHDIV GNLVSCVHSS VSPERAGDTE TLKDITAQDS ADIIFLIDGS 250
251 NNTGSVNFAV ILDFLVNLLE KLPIGTQQIR VGVVQFSDEP RTMFSLDTYS 300
301 TKAQVLGAVK ALGFAGGELA NIGLALDFVV ENHFTRAGGS RVEEGVPQVL 350
351 VLISAGPSSD EIRYGVVALK QASVFSFGLG AQAASRAELQ HIATDDNLVF 400
401 TVPEFRSFGD LQEKLLPYIV GVAQRHIVLK PPTIVTQVIE VNKRDIVFLV 450
451 DGSSALGLAN FNAIRDFIAK VIQRLEIGQD LIQVAVAQYA DTVRPEFYFN 500
501 THPTKREVIT AVRKMKPLDG SALYTGSALD FVRNNLFTSS AGYRAAEGIP 550
551 KLLVLITGGK SLDEISQPAQ ELKRSSIMAF AIGNKGADQA ELEEIAFDSS 600
601 LVFIPAEFRA APLQGMLPGL LAPLRTLSGT PEVHSNKRDI IFLLDGSANV 650
651 GKTNFPYVRD FVMNLVNSLD IGNDNIRVGL VQFSDTPVTE FSLNTYQTKS 700
701 DILGHLRQLQ LQGGSGLNTG SALSYVYANH FTEAGGSRIR EHVPQLLLLL 750
751 TAGQSEDSYL QAANALTRAG ILTFCVGASQ ANKAELEQIA FNPSLVYLMD 800
801 DFSSLPALPQ QLIQPLTTYV SGGVEEVPLA QPESKRDILF LFDGSANLVG 850
851 QFPVVRDFLY KIIDELNVKP EGTRIAVAQY SDDVKVESRF DEHQSKPEIL 900
901 NLVKRMKIKT GKALNLGYAL DYAQRYIFVK SAGSRIEDGV LQFLVLLVAG 950
951 RSSDRVDGPA SNLKQSGVVP FIFQAKNADP AELEQIVLSP AFILAAESLP 1000
1001 KIGDLHPQIV NLLKSVHNGA PAPVSGEKDV VFLLDGSEGV RSGFPLLKEF 1050
1051 VQRVVESLDV GQDRVRVAVV QYSDRTRPEF YLNSYMNKQD VVNAVRQLTL 1100
1101 LGGPTPNTGA ALEFVLRNIL VSSAGSRITE GVPQLLIVLT ADRSGDDVRN 1150
1151 PSVVVKRGGA VPIGIGIGNA DITEMQTISF IPDFAVAIPT FRQLGTVQQV 1200
1201 ISERVTQLTR EELSRLQPVL QPLPSPGVGG KRDVVFLIDG SQSAGPEFQY 1250
1251 VRTLIERLVD YLDVGFDTTR VAVIQFSDDP KVEFLLNAHS SKDEVQNAVQ 1300
1301 RLRPKGGRQI NVGNALEYVS RNIFKRPLGS RIEEGVPQFL VLISSGKSDD 1350
1351 EVDDPAVELK QFGVAPFTIA RNADQEELVK ISLSPEYVFS VSTFRELPSL 1400
1401 EQKLLTPITT LTSEQIQKLL ASTRYPPPAV ESDAADIVFL IDSSEGVRPD 1450
1451 GFAHIRDFVS RIVRRLNIGP SKVRVGVVQF SNDVFPEFYL KTYRSQAPVL 1500
1501 DAIRRLRLRG GSPLNTGKAL EFVARNLFVK SAGSRIEDGV PQHLVLVLGG 1550
1551 KSQDDVSRFA QVIRSSGIVS LGVGDRNIDR TELQTITNDP RLVFTVREFR 1600
1601 ELPNIEERIM NSFGPSAATP APPGVDTPPP SRPEKKKADI VFLLDGSINF 1650
1651 RRDSFQEVLR FVSEIVDTVY EDGDSIQVGL VQYNSDPTDE FFLKDFSTKR 1700
1701 QIIDAINKVV YKGGRHANTK VGLEHLRVNH FVPEAGSRLD QRVPQIAFVI 1750
1751 TGGKSVEDAQ DVSLALTQRG VKVFAVGVRN IDSEEVGKIA SNSATAFRVG 1800
1801 NVQELSELSE QVLETLHDAM HETLCPGVTD AAKACNLDVI LGFDGSRDQN 1850
1851 VFVAQKGFES KVDAILNRIS QMHRVSCSGG RSPTVRVSVV ANTPSGPVEA 1900
1901 FDFDEYQPEM LEKFRNMRSQ HPYVLTEDTL KVYLNKFRQS SPDSVKVVIH 1950
1951 FTDGADGDLA DLHRASENLR QEGVRALILV GLERVVNLER LMHLEFGRGF 2000
2001 MYDRPLRLNL LDLDYELAEQ LDNIAEKACC GVPCKCSGQR GDRGPIGSIG 2050
2051 PKGIPGEDGY RGYPGDEGGP GERGPPGVNG TQGFQGCPGQ RGVKGSRGFP 2100
2101 GEKGEVGEIG LDGLDGEDGD KGLPGSSGEK GNPGRRGDKG PRGEKGERGD 2150
2151 VGIRGDPGNP GQDSQERGPK GETGDLGPMG VPGRDGVPGG PGETGKNGGF 2200
2201 GRRGPPGAKG NKGGPGQPGF EGEQGTRGAQ GPAGPAGPPG LIGEQGISGP 2250
2251 RGSGGAAGAP GERGRTGPLG RKGEPGEPGP KGGIGNRGPR GETGDDGRDG 2300
2301 VGSEGRRGKK GERGFPGYPG PKGNPGEPGL NGTTGPKGIR GRRGNSGPPG 2350
2351 IVGQKGDPGY PGPAGPKGNR GDSIDQCALI QSIKDKCPCC YGPLECPVFP 2400
2401 TELAFALDTS EGVNQDTFGR MRDVVLSIVN DLTIAESNCP RGARVAVVTY 2450
2451 NNEVTTEIRF ADSKRKSVLL DKIKNLQVAL TSKQQSLETA MSFVARNTFK 2500
2501 RVRNGFLMRK VAVFFSNTPT RASPQLREAV LKLSDAGITP LFLTRQEDRQ 2550
2551 LINALQINNT AVGHALVLPA GRDLTDFLEN VLTCHVCLDI CNIDPSCGFG 2600
2601 SWRPSFRDRR AAGSDVDIDM AFILDSAETT TLFQFNEMKK YIAYLVRQLD 2650
2651 MSPDPKASQH FARVAVVQHA PSESVDNASM PPVKVEFSLT DYGSKEKLVD 2700
2701 FLSRGMTQLQ GTRALGSAIE YTIENVFESA PNPRDLKIVV LMLTGEVPEQ 2750
2751 QLEEAQRVIL QAKCKGYFFV VLGIGRKVNI KEVYTFASEP NDVFFKLVDK 2800
2801 STELNEEPLM RFGRLLPSFV SSENAFYLSP DIRKQCDWFQ GDQPTKNLVK 2850
2851 FGHKQVNVPN NVTSSPTSNP VTTTKPVTTT KPVTTTTKPV TTTTKPVTII 2900
2901 NQPSVKPAAA KPAPAKPVAA KPVATKMATV RPPVAVKPAT AAKPVAAKPA 2950
2951 AVRPPAAAAA KPVATKPEVP RPQAAKPAAT KPATTKPMVK MSREVQVFEI 3000
3001 TENSAKLHWE RAEPPGPYFY DLTVTSAHDQ SLVLKQNLTV TDRVIGGLLA 3050
3051 GQTYHVAVVC YLRSQVRATY HGSFSTKKSQ PPPPQPARSA SSSTINLMVS 3100
3101 TEPLALTETD ICKLPKDEGT CRDFILKWYY DPNTKSCARF WYGGCGGNEN 3150
3151 KFGSQKECEK VCAPVLAKPG VISVMGT 3177
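
The region coordinates on this page count residues from 1 and include both end points, whereas Python slices count from 0 and exclude the end. The following is a minimal sketch of pulling a region out of the displayed sequence, assuming the position markers and spaces are stripped first. `clean_sequence` and `extract_region` are hypothetical helper names, and only the first 50 residues are reproduced.

```python
import re


def clean_sequence(block: str) -> str:
    """Drop position markers and whitespace, keeping only residue letters."""
    return re.sub(r"[^A-Za-z]", "", block).upper()


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive coordinates."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"region {start}-{end} is outside 1-{len(sequence)}")
    return sequence[start - 1:end]


# First 50 residues as displayed above, including the position markers.
first_block = "1 MRKHRHLPLV AVFCLFLSGF PTTHAQQQQA DVKNGAAADI IFLVDSSWTI 50"

seq = clean_sequence(first_block)

# The sig_p row of the domain table covers residues 1 to 25.
signal_peptide = extract_region(seq, 1, 25)
print(signal_peptide)  # MRKHRHLPLVAVFCLFLSGFPTTHA
```

The same call with 3111 and 3163 would return the Kunitz_BPTI region once the full 3177-residue sequence has been loaded.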
 

Checksums:
CRC64:56D54CAC4FBB30AF
MD5:4994d9c6f0655700dc0be936bf969167
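
A stored checksum such as the MD5 value above can be compared against a locally held copy of the sequence. The following is a minimal sketch using Python's standard `hashlib` module. It assumes the digest is computed over the plain upper-case residue string with no whitespace or position markers, which is an assumption and is not stated on this page. `sequence_md5` and `matches_expected` are hypothetical helper names.

```python
import hashlib
import re

# MD5 value listed in the Checksums section above.
EXPECTED_MD5 = "4994d9c6f0655700dc0be936bf969167"


def sequence_md5(raw: str) -> str:
    """MD5 hex digest of the residues only (assumed convention: upper case,
    no whitespace, no position markers)."""
    residues = re.sub(r"[^A-Za-z]", "", raw).upper()
    return hashlib.md5(residues.encode("ascii")).hexdigest()


def matches_expected(raw: str) -> bool:
    """True if a local copy of the sequence has the listed MD5 checksum."""
    return sequence_md5(raw) == EXPECTED_MD5
```

A mismatch does not necessarily indicate an error: as noted above, the sequence stored in Pfam may differ from the one currently held by UniProt.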

Structures

For sequences that have a structure in the Protein Data Bank, we use the mapping between the UniProt, PDB and Pfam coordinate systems from the PDBe SIFTS project to map Pfam domains onto three-dimensional structures. The table below shows the mapping between Pfam domains, this UniProt entry and the corresponding three-dimensional structures.

Pfam family   UniProt residues   PDB ID   PDB chain ID   PDB residues
Kunitz_BPTI   3111 - 3163        1KNT     A              4 - 56
Kunitz_BPTI   3111 - 3163        1KTH     A              4 - 56
Kunitz_BPTI   3111 - 3163        1KUN     A              4 - 56
Kunitz_BPTI   3111 - 3163        2KNT     A              4 - 56
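
For the Kunitz_BPTI entries in the table above, UniProt residues 3111 to 3163 correspond to PDB residues 4 to 56, a constant offset of 3107. The following is a minimal sketch of that conversion, assuming the mapping is contiguous over the whole range. Real SIFTS mappings can contain gaps or insertion codes, which this sketch ignores. `SegmentMapping` is a hypothetical name introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentMapping:
    """One contiguous UniProt-to-PDB segment (both ranges inclusive)."""
    uniprot_start: int
    uniprot_end: int
    pdb_start: int
    pdb_end: int

    def to_pdb(self, uniprot_pos: int) -> int:
        """Convert a UniProt residue number to the PDB residue number."""
        if not self.uniprot_start <= uniprot_pos <= self.uniprot_end:
            raise ValueError(f"{uniprot_pos} is outside the mapped segment")
        return uniprot_pos - self.uniprot_start + self.pdb_start


# Values taken from the table above (chain A of 1KNT, 1KTH, 1KUN, 2KNT).
kunitz = SegmentMapping(3111, 3163, 4, 56)

print(kunitz.to_pdb(3111), kunitz.to_pdb(3163))  # 4 56
```

Storing both ranges, instead of a bare offset, lets the conversion reject residues that fall outside the structure.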

TreeFam

Below is a phylogenetic tree of animal genes, with ortholog and paralog assignments, from TreeFam.