
Protein: POLG_HCV6A (Q5I2N3)

Summary

This is the summary of UniProt entry POLG_HCV6A (Q5I2N3).

Description: Genome polyprotein EC=3.4.22.- EC=3.4.21.98 EC=3.6.1.15 EC=3.6.4.13 EC=2.7.7.48
Source organism: Hepatitis C virus genotype 6a (isolate 6a33) (HCV) (NCBI taxonomy ID )
Length: 3019 amino acids
Reference Proteome: x

Please note: at the start of each Pfam data release, we take a copy of the UniProt sequence database. This snapshot of UniProt forms the basis of the overview shown here. UniProt entries that are removed after a Pfam release remain in Pfam until the next Pfam data release.

Pfam domains

Source  Domain         Start   End
Pfam    HCV_capsid         2   115
Pfam    HCV_core         116   190
Pfam    HCV_env          193   382
Pfam    HCV_NS1          385   734
Pfam    HCV_NS2          816  1010
Pfam    Peptidase_S29   1061  1209
Pfam    Flavi_DEAD      1240  1367
Pfam    HCV_NS4a        1663  1716
Pfam    HCV_NS4b        1733  1926
Pfam    HCV_NS5a        1979  2001
Pfam    HCV_NS5a_1a     2011  2072
Pfam    HCV_NS5a_1b     2073  2173
Pfam    HCV_NS5a_C      2184  2428
Pfam    RdRP_3          2431  2942
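
The domain coordinates above can be used to work out how long each domain is and which stretches of the 3019-residue polyprotein fall outside every Pfam match. The following is a minimal sketch, not part of the Pfam site: the names DOMAINS, domain_lengths and uncovered_regions are hypothetical, and it assumes that Start and End are 1-based and inclusive.

```python
# Domain rows copied from the table above as (name, start, end).
# Assumption: coordinates are 1-based and inclusive.
DOMAINS = [
    ("HCV_capsid", 2, 115),
    ("HCV_core", 116, 190),
    ("HCV_env", 193, 382),
    ("HCV_NS1", 385, 734),
    ("HCV_NS2", 816, 1010),
    ("Peptidase_S29", 1061, 1209),
    ("Flavi_DEAD", 1240, 1367),
    ("HCV_NS4a", 1663, 1716),
    ("HCV_NS4b", 1733, 1926),
    ("HCV_NS5a", 1979, 2001),
    ("HCV_NS5a_1a", 2011, 2072),
    ("HCV_NS5a_1b", 2073, 2173),
    ("HCV_NS5a_C", 2184, 2428),
    ("RdRP_3", 2431, 2942),
]
PROTEIN_LENGTH = 3019  # "Length" field of the summary


def domain_lengths(domains):
    """Map each domain name to its length in residues."""
    return {name: end - start + 1 for name, start, end in domains}


def uncovered_regions(domains, length):
    """Return 1-based inclusive (start, end) ranges not covered by any domain."""
    gaps = []
    cursor = 1  # first residue not yet accounted for
    for _, start, end in sorted(domains, key=lambda d: d[1]):
        if start > cursor:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= length:
        gaps.append((cursor, length))
    return gaps


if __name__ == "__main__":
    for name, size in domain_lengths(DOMAINS).items():
        print(f"{name}: {size} residues")
    for start, end in uncovered_regions(DOMAINS, PROTEIN_LENGTH):
        print(f"no Pfam match: {start}-{end}")
```

Sorting by start position before scanning keeps the gap calculation correct even if the rows are supplied out of order.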

Sequence information

This is the amino acid sequence of the UniProt sequence database entry with the accession Q5I2N3. Pfam stores this sequence and updates it only with each new Pfam release, so the stored sequence may differ from the one currently held by UniProt.

Sequence:
1
MSTLPKPQRK TKRNTNRRPM DVKFPGGGQI VGGVYLLPRR GPRLGVRATR
50
51
KTSERSQPRG RRQPIPKARQ PQGRHWAQPG YPWPLYGNEG CGWAGWLLSP
100
101
RGSRPHWGPN DPRRRSRNLG KVIDTLTCGF ADLMGYIPVV GAPLGGVAAA
150
151
LAHGVRAIED GINYATGNLP GCSFSIFLLA LLSCLTTPAS ALTYGNSSGL
200
201
YHLTNDCPNS SIVLEADAMI LHLPGCLPCV KVGNQSTCWH AVSPTLAIPN
250
251
ASTPATGFRR HVDLLAGAAV VCSSLYIGDL CGSLFLAGQL FTFQPRRHWT
300
301
VQECNCSIYT GHVTGHRMAW DMMMSWSPTT TLVLSSILRV PEICASVIFG
350
351
GHWGILLAVA YFGMAGNWLK VLAVLFLFAG VEATTTVGHG VARTTAGITG
400
401
LFSPGASQNL QLIKNGSSWH INRTALNCND SLQTGFLASL FYVRKFNSSG
450
451
CPERMAVCKS LADFRQGWGQ ITYKVNISGP SDDRPYCWHY APRPCDVVPA
500
501
STVCGPVYCF TPSPVVIGTT DRRGNPTYTW GENETDVFML ESLRPPTGGW
550
551
FGCTWMNSTG FTKTCGAPPC QIIPGDYNSS ANELLCPTDC FRKHPEATYQ
600
601
RCGSGPWVTP RCLVDYPYRL WHYPCTVNFT VHKVRMFVGG IEHRFDAACN
650
651
WTRGERCELH DRDRIEMSPL LFSTTQLAIL PCSFSTMPAL STGLIHLHQN
700
701
IVDVQYLYGV SSSVTSWVVK WEYIVLMFLV LADARICTCL WLMLLISNVE
750
751
AAVERLVVLN AASAAGTAGW WWAVLFLCCV WYVKGRLVPA CTYMALGMWP
800
801
LLLTILALPH RAYAMDNEQA ASLGAVGLLA ITIFTITPTY KKLLTCFIWW
850
851
NQYFLARAEA MVHEWVPDLR VRGGRDSIIL LTCLLHPQLG FEVTKILLAI
900
901
LAPLYILQYS LLKVPYFVRA HILLRACLLV RRLAGGRYVQ ACLLRLGAWT
950
951
GTFIYDHLAP LSDWASDGLR DLAVAVEPVI FSPMEKKIIT WGADTAACGD
1000
1001
ILSGLPVSAR LGNLVLLGPA DDMQRGGWKL LAPITAYAQQ TRGLVGTIVT
1050
1051
SLTGRDKNEV EGEVQVVSTA TQSFLATSIN GVMWTVYHGA GSKTLAGPKG
1100
1101
PVCQMYTNVD KDLVGWPSPP GARSLTPCTC GSSDLYLVTR EADVIPARRR
1150
1151
GDNRAALLSP RPISTLKGSS GGPVMCPSGH VVGLFRAAVC TRGVAKSLDF
1200
1201
IPVENMETTM RSPSFTDNST PPAVPQTYQV GYLHAPTGSG KSTRVPAAYA
1250
1251
SQGYKVLVLN PSVAATLSFG SYMRQAYGVE PNVRTGVRTV TTGGAITYST
1300
1301
YGKFLADGGC SGGAYDIIIC DECHSTDPTT VLGIGTVLDQ AETAGARLTV
1350
1351
LATATPPGSI TVPHPNITET ALPTTGEIPF YGKAIPLEYI KGGRHLIFCH
1400
1401
SKKKCDELAG KLKSLGLNAV AFYRGVDVSV IPTSGDVVIC ATDALMTGYT
1450
1451
GDFDSVIDCN VAVTQVVDFS LDPTFSIETT TVPQDAVSRS QRRGRTGRGK
1500
1501
PGVYRFVSQG ERPSGMFDTV VLCEAYDTGC AWYELTPSET TVRLRAYMNT
1550
1551
PGLPVCQDHL EFWEGVFTGL THIDAHFLSQ TKQGGENFAY LVAYQATVCA
1600
1601
RAKAPPPSWD TMWKCLIRLK PTLTGPTPLL YRLGAVQNEI ITTHPITKYI
1650
1651
MTCMSADLEV ITSTWVLVGG VLAALAAYCL SVGCVVICGR ITLTGKPAVV
1700
1701
PDREILYQQF DEMEECSRHI PYLAEGQQIA EQFRQKVLGL LQASAKQAEE
1750
1751
LKPAVHSAWP RMEEFWRKHM WNFVSGIQYL AGLSTLPGNP AVASLMSFTA
1800
1801
SLTSPLRTSQ TLLLNILGGW IAAQVAPPPA STAFVVSGLA GAAVGSIRLG
1850
1851
RVLVDVLAGY GAGVSGALVA FKIMSGDCPT TEDMVNLLPA LLSPGALVVG
1900
1901
VVCAAILRRH VGPAEGANQW MNRLIAFASR GNHVSPTHYV PETDASKNVT
1950
1951
QILTSLTITS LLRRLHQWVN EDTATPCATS WLRDVWDWVC TVLSDFKVWL
2000
2001
QAKLFPRLPG IPFLSCQTGY RGVWAGDGVC HTTCTCGAVI AGHVKNGTMK
2050
2051
ITGPKTCSNT WHGTFPINAT TTGPSTPRPA PNYQRALWRV SAEDYVEVRR
2100
2101
LGDCHYVVGV TAEGLKCPCQ VPAPEFFTEV DGVRIHRYAP PCKPLLRDEV
2150
2151
TFSVGLSNYA IGSQLPCEPE PDVTVVTSML TDPTHITAET ASRRLKRGSP
2200
2201
PSLASSSASQ LSAPSLKATC TTSKDHPDME LIEANLLWRQ EMGGNITRVE
2250
2251
SENKVVVLDS FEPLTAEYDE REISVSAECH RPPRHKFPPA LPIWARPDYN
2300
2301
PPLLQAWQMP GYEPPVVSGC AVAPPKPAPI PPPRRKRLVH LDESTVSRAL
2350
2351
AQLADKVFVE GSSDPGPSSD SGLSITSPDP PAPTTPDDAC SEAESYSSMP
2400
2401
PLEGEPGDPD LSSGSWSTVS DQDDVVCCSM SYSWTGALIT PCAAEEEKLP
2450
2451
INPLSNSLIR HHNMVYSTTS RSASLRQKKV TFDRLQVFDQ HYQDVLKEIK
2500
2501
LRASTVQARL LSIEEACDLT PSHSARSKYG YGAQDVRSHA SKAINHIRSV
2550
2551
WEDLLEDSDT PIPTTIMAKN EVFCVDPSKG GRKPARLIVY PDLGVRVCEK
2600
2601
MALYDVTRKL PQAVMGSAYG FQYSPNQRVE YLLKMWRSKK VPMGFSYDTR
2650
2651
CFDSTVTERD IRTENDIYQS CQLDPVARRA VSSLTERLYV GGPMVNSKGQ
2700
2701
SCGYRRCRAS GVLPTSMGNT LTCYLKAQAA CRAANIKDCD MLVCGDDLVV
2750
2751
ICESAGVQED TASLRAFTDA MTRYSAPPGD VPQPTYDLEL ITSCSSNVSV
2800
2801
AHDGNGKRYY YLTRDCTTPL ARAAWETARH TPVNSWLGNI IMFAPTIWVR
2850
2851
MVLMTHFFSI LQSQEQLEKA LDFDIYGVTY SVSPLDLPAI IQRLHGMAAF
2900
2901
SLHGYSPTEL NRVGACLRKL GVPPLRAWRH RARAVRAKLI AQGGKAAICG
2950
2951
KYLFNWAVKT KLKLTPLVSA SKLDLSGWFV AGYDGGDIYH SVSQARPRLL
3000
3001
LLGLLLLTVG VGIFLVPAR
3019
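
The listing above interleaves rows of residues (five blocks of ten) with their start and end positions. To work with the sequence programmatically, the position markers and spacing have to be stripped. The sketch below shows one way to do that; the function name parse_numbered_listing is hypothetical, and SAMPLE holds only the first two rows of the listing for illustration.

```python
def parse_numbered_listing(text):
    """Join residue rows from a listing whose rows are bracketed by position markers."""
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.isdigit():
            continue  # skip blank lines and start/end position markers
        rows.append("".join(stripped.split()))  # drop the spaces between blocks
    return "".join(rows)


# First two rows of the listing above, copied verbatim.
SAMPLE = """1
MSTLPKPQRK TKRNTNRRPM DVKFPGGGQI VGGVYLLPRR GPRLGVRATR
50
51
KTSERSQPRG RRQPIPKARQ PQGRHWAQPG YPWPLYGNEG CGWAGWLLSP
100
"""

if __name__ == "__main__":
    seq = parse_numbered_listing(SAMPLE)
    print(len(seq))   # number of residues recovered from the sample
    print(seq[:10])   # first block of ten residues
```

Applied to the full listing, the result should contain 3019 residues, matching the Length field in the summary. Assuming the domain coordinates are 1-based and inclusive, a domain's residues would then be seq[start - 1:end].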
 

Checksums:
CRC64:FF1161164B164DF3
MD5:a00eb5f6c94ed48ee4541c1dd13b34c3
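
Because the sequence stored by Pfam may differ from the one currently in UniProt, the MD5 checksum offers a quick way to test whether two copies of the sequence are identical. The following is a sketch only: the function names are hypothetical, and it assumes the checksum is computed over the upper-case residue letters with all whitespace removed, which this page does not state. The CRC64 value is not covered here.

```python
import hashlib


def sequence_md5(sequence):
    """MD5 hex digest of a sequence after removing whitespace and upper-casing.

    Assumption: this normalization matches how the page's MD5 was produced.
    """
    normalized = "".join(sequence.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


def matches_checksum(sequence, expected_md5):
    """Compare a sequence against a hex MD5 string, ignoring letter case in the hex."""
    return sequence_md5(sequence) == expected_md5.strip().lower()


if __name__ == "__main__":
    # Hypothetical usage: paste the full 3019-residue sequence into `seq`
    # and compare it with the MD5 value listed above.
    seq = "MSTLPKPQRK TKRNTNRRPM"  # truncated placeholder, not the full sequence
    print(sequence_md5(seq))
    print(matches_checksum(seq, "a00eb5f6c94ed48ee4541c1dd13b34c3"))
```

A mismatch on the full sequence would indicate either that the two copies differ or that the normalization assumed here is not the one used to produce the stored checksum.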