Protein: R1A_CVEMC (K9N638)

Summary

This is the summary of UniProt entry R1A_CVEMC (K9N638).

Description: Replicase polyprotein 1a EC=3.4.19.12 EC=3.4.22.69 EC=3.4.22.-
Source organism: Human coronavirus EMC (isolate United Kingdom/H123990006/2012) (HCoV-EMC) (NCBI taxonomy ID 1263720)
Length: 4391 amino acids
Reference Proteome: ✓

Please note: when we start each new Pfam data release, we take a copy of the UniProt sequence database. This snapshot of UniProt forms the basis of the overview that you see here. It is important to note that, although some UniProt entries may be removed after a Pfam release, these entries will not be removed from Pfam until the next Pfam data release.

Pfam domains

Source Domain Start End
low_complexity n/a 225 238
low_complexity n/a 780 796
low_complexity n/a 927 938
disorder n/a 970 1001
disorder n/a 1005 1014
disorder n/a 1016 1032
low_complexity n/a 1024 1036
disorder n/a 1051 1073
disorder n/a 1075 1090
disorder n/a 1092 1102
Pfam Macro 1143 1249
low_complexity n/a 1152 1166
Pfam SUD-M 1280 1396
Pfam Viral_protease 1481 1806
Pfam NAR 1841 1952
low_complexity n/a 2077 2087
transmembrane n/a 2105 2125
low_complexity n/a 2172 2189
transmembrane n/a 2175 2195
transmembrane n/a 2281 2300
transmembrane n/a 2307 2325
transmembrane n/a 2337 2362
transmembrane n/a 2756 2777
low_complexity n/a 2838 2847
transmembrane n/a 3024 3049
transmembrane n/a 3061 3092
low_complexity n/a 3088 3100
transmembrane n/a 3112 3138
Pfam Corona_NSP4_C 3151 3246
low_complexity n/a 3203 3214
Pfam Peptidase_C30 3276 3566
transmembrane n/a 3564 3582
transmembrane n/a 3594 3613
transmembrane n/a 3620 3638
transmembrane n/a 3669 3687
transmembrane n/a 3694 3710
transmembrane n/a 3730 3752
transmembrane n/a 3764 3786
Pfam nsp7 3846 3928
low_complexity n/a 3854 3864
Pfam nsp8 3929 4127
Pfam nsp9 4128 4237
Pfam NSP10 4246 4368
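
The rows above follow a simple four-column layout (source, domain name, start, end), so they can be read programmatically. The following is a minimal sketch, not Pfam's own tooling: the names `Region`, `parse_regions`, and `overlaps` are hypothetical, and only a few rows from the table are embedded for illustration. It assumes start and end are 1-based, inclusive residue positions.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """One row of the domain table (hypothetical helper type)."""
    source: str  # e.g. "Pfam", "disorder", "transmembrane"
    name: str    # Pfam family name, or "n/a" for non-Pfam rows
    start: int   # assumed 1-based, inclusive
    end: int     # assumed 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def parse_regions(text: str) -> List[Region]:
    """Parse whitespace-separated 'source name start end' rows."""
    regions = []
    for line in text.strip().splitlines():
        source, name, start, end = line.split()
        regions.append(Region(source, name, int(start), int(end)))
    return regions


def overlaps(a: Region, b: Region) -> bool:
    """True if the two inclusive ranges share at least one residue."""
    return a.start <= b.end and b.start <= a.end


# A few rows copied from the table above.
ROWS = """
Pfam Macro 1143 1249
low_complexity n/a 1152 1166
Pfam SUD-M 1280 1396
Pfam Viral_protease 1481 1806
transmembrane n/a 2105 2125
"""

regions = parse_regions(ROWS)
pfam_domains = [r for r in regions if r.source == "Pfam"]
for domain in pfam_domains:
    print(domain.name, domain.start, domain.end, domain.length)
```

With the rows parsed this way, `overlaps` can be used to check, for example, whether a low-complexity region falls inside a Pfam domain.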

Sequence information

This is the amino acid sequence of the UniProt sequence database entry with the accession K9N638. The sequence is stored in the Pfam database and is updated only with each new Pfam release, so the sequence we store may differ from the one currently stored by UniProt.

Sequence:
1
MSFVAGVTAQ GARGTYRAAL NSEKHQDHVS LTVPLCGSGN LVEKLSPWFM
50
51
DGENAYEVVK AMLLKKEPLL YVPIRLAGHT RHLPGPRVYL VERLIACENP
100
101
FMVNQLAYSS SANGSLVGTT LQGKPIGMFF PYDIELVTGK QNILLRKYGR
150
151
GGYHYTPFHY ERDNTSCPEW MDDFEADPKG KYAQNLLKKL IGGDVTPVDQ
200
201
YMCGVDGKPI SAYAFLMAKD GITKLADVEA DVAARADDEG FITLKNNLYR
250
251
LVWHVERKDV PYPKQSIFTI NSVVQKDGVE NTPPHYFTLG CKILTLTPRN
300
301
KWSGVSDLSL KQKLLYTFYG KESLENPTYI YHSAFIECGS CGNDSWLTGN
350
351
AIQGFACGCG ASYTANDVEV QSSGMIKPNA LLCATCPFAK GDSCSSNCKH
400
401
SVAQLVSYLS ERCNVIADSK SFTLIFGGVA YAYFGCEEGT MYFVPRAKSV
450
451
VSRIGDSIFT GCTGSWNKVT QIANMFLEQT QHSLNFVGEF VVNDVVLAIL
500
501
SGTTTNVDKI RQLLKGVTLD KLRDYLADYD VAVTAGPFMD NAINVGGTGL
550
551
QYAAITAPYV VLTGLGESFK KVATIPYKVC NSVKDTLTYY AHSVLYRVFP
600
601
YDMDSGVSSF SELLFDCVDL SVASTYFLVR LLQDKTGDFM STIITSCQTA
650
651
VSKLLDTCFE ATEATFNFLL DLAGLFRIFL RNAYVYTSQG FVVVNGKVST
700
701
LVKQVLDLLN KGMQLLHTKV SWAGSNISAV IYSGRESLIF PSGTYYCVTT
750
751
KAKSVQQDLD VILPGEFSKK QLGLLQPTDN STTVSVTVSS NMVETVVGQL
800
801
EQTNMHSPDV IVGDYVIISE KLFVRSKEED GFAFYPACTN GHAVPTLFRL
850
851
KGGAPVKKVA FGGDQVHEVA AVRSVTVEYN IHAVLDTLLA SSSLRTFVVD
900
901
KSLSIEEFAD VVKEQVSDLL VKLLRGMPIP DFDLDDFIDA PCYCFNAEGD
950
951
ASWSSTMIFS LHPVECDEEC SEVEASDLEE GESECISETS TEQVDVSHEI
1000
1001
SDDEWAAAVD EAFPLDEAED VTESVQEEAQ PVEVPVEDIA QVVIADTLQE
1050
1051
TPVVSDTVEV PPQVVKLPSE PQTIQPEVKE VAPVYEADTE QTQSVTVKPK
1100
1101
RLRKKRNVDP LSNFEHKVIT ECVTIVLGDA IQVAKCYGES VLVNAANTHL
1150
1151
KHGGGIAGAI NAASKGAVQK ESDEYILAKG PLQVGDSVLL QGHSLAKNIL
1200
1201
HVVGPDARAK QDVSLLSKCY KAMNAYPLVV TPLVSAGIFG VKPAVSFDYL
1250
1251
IREAKTRVLV VVNSQDVYKS LTIVDIPQSL TFSYDGLRGA IRKAKDYGFT
1300
1301
VFVCTDNSAN TKVLRNKGVD YTKKFLTVDG VQYYCYTSKD TLDDILQQAN
1350
1351
KSVGIISMPL GYVSHGLDLI QAGSVVRRVN VPYVCLLANK EQEAILMSED
1400
1401
VKLNPSEDFI KHVRTNGGYN SWHLVEGELL VQDLRLNKLL HWSDQTICYK
1450
1451
DSVFYVVKNS TAFPFETLSA CRAYLDSRTT QQLTIEVLVT VDGVNFRTVV
1500
1501
LNNKNTYRSQ LGCVFFNGAD ISDTIPDEKQ NGHSLYLADN LTADETKALK
1550
1551
ELYGPVDPTF LHRFYSLKAA VHKWKMVVCD KVRSLKLSDN NCYLNAVIMT
1600
1601
LDLLKDIKFV IPALQHAFMK HKGGDSTDFI ALIMAYGNCT FGAPDDASRL
1650
1651
LHTVLAKAEL CCSARMVWRE WCNVCGIKDV VLQGLKACCY VGVQTVEDLR
1700
1701
ARMTYVCQCG GERHRQIVEH TTPWLLLSGT PNEKLVTTST APDFVAFNVF
1750
1751
QGIETAVGHY VHARLKGGLI LKFDSGTVSK TSDWKCKVTD VLFPGQKYSS
1800
1801
DCNVVRYSLD GNFRTEVDPD LSAFYVKDGK YFTSEPPVTY SPATILAGSV
1850
1851
YTNSCLVSSD GQPGGDAISL SFNNLLGFDS SKPVTKKYTY SFLPKEDGDV
1900
1901
LLAEFDTYDP IYKNGAMYKG KPILWVNKAS YDTNLNKFNR ASLRQIFDVA
1950
1951
PIELENKFTP LSVESTPVEP PTVDVVALQQ EMTIVKCKGL NKPFVKDNVS
2000
2001
FVADDSGTPV VEYLSKEDLH TLYVDPKYQV IVLKDNVLSS MLRLHTVESG
2050
2051
DINVVAASGS LTRKVKLLFR ASFYFKEFAT RTFTATTAVG SCIKSVVRHL
2100
2101
GVTKGILTGC FSFVKMLFML PLAYFSDSKL GTTEVKVSAL KTAGVVTGNV
2150
2151
VKQCCTAAVD LSMDKLRRVD WKSTLRLLLM LCTTMVLLSS VYHLYVFNQV
2200
2201
LSSDVMFEDA QGLKKFYKEV RAYLGISSAC DGLASAYRAN SFDVPTFCAN
2250
2251
RSAMCNWCLI SQDSITHYPA LKMVQTHLSH YVLNIDWLWF AFETGLAYML
2300
2301
YTSAFNWLLL AGTLHYFFAQ TSIFVDWRSY NYAVSSAFWL FTHIPMAGLV
2350
2351
RMYNLLACLW LLRKFYQHVI NGCKDTACLL CYKRNRLTRV EASTVVCGGK
2400
2401
RTFYITANGG ISFCRRHNWN CVDCDTAGVG NTFICEEVAN DLTTALRRPI
2450
2451
NATDRSHYYV DSVTVKETVV QFNYRRDGQP FYERFPLCAF TNLDKLKFKE
2500
2501
VCKTTTGIPE YNFIIYDSSD RGQESLARSA CVYYSQVLCK SILLVDSSLV
2550
2551
TSVGDSSEIA TKMFDSFVNS FVSLYNVTRD KLEKLISTAR DGVRRGDNFH
2600
2601
SVLTTFIDAA RGPAGVESDV ETNEIVDSVQ YAHKHDIQIT NESYNNYVPS
2650
2651
YVKPDSVSTS DLGSLIDCNA ASVNQIVLRN SNGACIWNAA AYMKLSDALK
2700
2701
RQIRIACRKC NLAFRLTTSK LRANDNILSV RFTANKIVGG APTWFNALRD
2750
2751
FTLKGYVLAT IIVFLCAVLM YLCLPTFSMV PVEFYEDRIL DFKVLDNGII
2800
2801
RDVNPDDKCF ANKHRSFTQW YHEHVGGVYD NSITCPLTVA VIAGVAGARI
2850
2851
PDVPTTLAWV NNQIIFFVSR VFANTGSVCY TPIDEIPYKS FSDSGCILPS
2900
2901
ECTMFRDAEG RMTPYCHDPT VLPGAFAYSQ MRPHVRYDLY DGNMFIKFPE
2950
2951
VVFESTLRIT RTLSTQYCRF GSCEYAQEGV CITTNGSWAI FNDHHLNRPG
3000
3001
VYCGSDFIDI VRRLAVSLFQ PITYFQLTTS LVLGIGLCAF LTLLFYYINK
3050
3051
VKRAFADYTQ CAVIAVVAAV LNSLCICFVA SIPLCIVPYT ALYYYATFYF
3100
3101
TNEPAFIMHV SWYIMFGPIV PIWMTCVYTV AMCFRHFFWV LAYFSKKHVE
3150
3151
VFTDGKLNCS FQDAASNIFV INKDTYAALR NSLTNDAYSR FLGLFNKYKY
3200
3201
FSGAMETAAY REAAACHLAK ALQTYSETGS DLLYQPPNCS ITSGVLQSGL
3250
3251
VKMSHPSGDV EACMVQVTCG SMTLNGLWLD NTVWCPRHVM CPADQLSDPN
3300
3301
YDALLISMTN HSFSVQKHIG APANLRVVGH AMQGTLLKLT VDVANPSTPA
3350
3351
YTFTTVKPGA AFSVLACYNG RPTGTFTVVM RPNYTIKGSF LCGSCGSVGY
3400
3401
TKEGSVINFC YMHQMELANG THTGSAFDGT MYGAFMDKQV HQVQLTDKYC
3450
3451
SVNVVAWLYA AILNGCAWFV KPNRTSVVSF NEWALANQFT EFVGTQSVDM
3500
3501
LAVKTGVAIE QLLYAIQQLY TGFQGKQILG STMLEDEFTP EDVNMQIMGV
3550
3551
VMQSGVRKVT YGTAHWLFAT LVSTYVIILQ ATKFTLWNYL FETIPTQLFP
3600
3601
LLFVTMAFVM LLVKHKHTFL TLFLLPVAIC LTYANIVYEP TTPISSALIA
3650
3651
VANWLAPTNA YMRTTHTDIG VYISMSLVLV IVVKRLYNPS LSNFALALCS
3700
3701
GVMWLYTYSI GEASSPIAYL VFVTTLTSDY TITVFVTVNL AKVCTYAIFA
3750
3751
YSPQLTLVFP EVKMILLLYT CLGFMCTCYF GVFSLLNLKL RAPMGVYDFK
3800
3801
VSTQEFRFMT ANNLTAPRNS WEAMALNFKL IGIGGTPCIK VAAMQSKLTD
3850
3851
LKCTSVVLLS VLQQLHLEAN SRAWAFCVKC HNDILAATDP SEAFEKFVSL
3900
3901
FATLMTFSGN VDLDALASDI FDTPSVLQAT LSEFSHLATF AELEAAQKAY
3950
3951
QEAMDSGDTS PQVLKALQKA VNIAKNAYEK DKAVARKLER MADQAMTSMY
4000
4001
KQARAEDKKA KIVSAMQTML FGMIKKLDND VLNGIISNAR NGCIPLSVIP
4050
4051
LCASNKLRVV IPDFTVWNQV VTYPSLNYAG ALWDITVINN VDNEIVKSSD
4100
4101
VVDSNENLTW PLVLECTRAS TSAVKLQNNE IKPSGLKTMV VSAGQEQTNC
4150
4151
NTSSLAYYEP VQGRKMLMAL LSDNAYLKWA RVEGKDGFVS VELQPPCKFL
4200
4201
IAGPKGPEIR YLYFVKNLNN LHRGQVLGHI AATVRLQAGS NTEFASNSSV
4250
4251
LSLVNFTVDP QKAYLDFVNA GGAPLTNCVK MLTPKTGTGI AISVKPESTA
4300
4301
DQETYGGASV CLYCRAHIEH PDVSGVCKYK GKFVQIPAQC VRDPVGFCLS
4350
4351
NTPCNVCQYW IGYGCNCDSL RQAALPQSKD SNFLNESGVL L
4391
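
The sequence above is shown in blocks of ten residues, with position markers on separate lines. To work with it as a plain string, the markers and spaces need to be stripped. A minimal sketch follows; `clean_sequence` is a hypothetical helper name, and only the first two rows of the block are embedded here as sample input.

```python
def clean_sequence(formatted: str) -> str:
    """Strip position-marker lines and spacing from a formatted sequence.

    Lines consisting only of digits are treated as position markers;
    all remaining whitespace is removed.
    """
    chunks = []
    for line in formatted.splitlines():
        stripped = line.strip()
        if not stripped or stripped.isdigit():
            continue  # skip blank lines and position markers
        chunks.append("".join(stripped.split()))
    return "".join(chunks).upper()


# First two rows of the sequence block above, used as sample input.
SAMPLE = """1
MSFVAGVTAQ GARGTYRAAL NSEKHQDHVS LTVPLCGSGN LVEKLSPWFM
50
51
DGENAYEVVK AMLLKKEPLL YVPIRLAGHT RHLPGPRVYL VERLIACENP
100
"""

sequence = clean_sequence(SAMPLE)
print(len(sequence), sequence[:20])
```

Applied to the full block, the length of the cleaned string should agree with the Length field in the summary (4391 amino acids); a mismatch would suggest a copying error.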

Checksums:
CRC64:D0A87AE59773BBB8
MD5:b0550a2c070bbb134cde5c0f9766019b
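
The MD5 checksum can be used to confirm that a locally stored copy of the sequence matches this entry. The sketch below uses Python's standard `hashlib`. It assumes the checksum is computed over the uppercase, whitespace-free sequence string, which is an assumption and is not stated on this page; `sequence_md5` and `matches_checksum` are hypothetical helper names. The CRC64 value is not covered here.

```python
import hashlib


def sequence_md5(sequence: str) -> str:
    """Return the hex MD5 digest of a sequence.

    Assumption: the digest is taken over the uppercase sequence with
    all whitespace removed.
    """
    normalized = "".join(sequence.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


def matches_checksum(sequence: str, expected_md5: str) -> bool:
    """Compare a sequence against a published MD5 value (case-insensitive)."""
    return sequence_md5(sequence) == expected_md5.strip().lower()


# Value copied from the Checksums section above.
PUBLISHED_MD5 = "b0550a2c070bbb134cde5c0f9766019b"

# Usage, with `full_sequence` standing in for the complete 4391-residue string:
# print(matches_checksum(full_sequence, PUBLISHED_MD5))
```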

Structures

For sequences that have a structure in the Protein Data Bank, we use the mapping between the UniProt, PDB and Pfam coordinate systems from the PDBe SIFTS project to map Pfam domains onto three-dimensional structures. The table below shows the mapping between Pfam domains, this UniProt entry and the corresponding three-dimensional structures.

Pfam family      UniProt residues   PDB ID   PDB chain ID   PDB residues
Macro            1143 - 1249        5DUS     A              34 - 140
Macro            1143 - 1249        5HIH     A              35 - 141
Viral_protease   1483 - 1806        4R3D     A              3 - 326
Viral_protease   1483 - 1806        4R3D     B              3 - 326
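
Each row in the table pairs a UniProt residue range with a PDB residue range of the same length, so a UniProt position can be converted to a PDB residue number by a constant offset. The sketch below illustrates this for the 5DUS and 4R3D chain A rows. It assumes the numbering is contiguous (no gaps or insertions) within each range; the per-residue SIFTS mapping remains the authoritative source. `DomainMapping` and `uniprot_to_pdb` are hypothetical names.

```python
from typing import NamedTuple, Optional


class DomainMapping(NamedTuple):
    """One row of the structure table (hypothetical helper type)."""
    family: str
    pdb_id: str
    chain: str
    uniprot_start: int
    uniprot_end: int
    pdb_start: int
    pdb_end: int


def uniprot_to_pdb(mapping: DomainMapping, position: int) -> Optional[int]:
    """Convert a UniProt position to a PDB residue number.

    Assumes gapless, linear numbering within the mapped range.
    Returns None if the position lies outside the mapped range.
    """
    if not mapping.uniprot_start <= position <= mapping.uniprot_end:
        return None
    return mapping.pdb_start + (position - mapping.uniprot_start)


# Rows taken from the table above.
MACRO_5DUS = DomainMapping("Macro", "5DUS", "A", 1143, 1249, 34, 140)
PROTEASE_4R3D = DomainMapping("Viral_protease", "4R3D", "A", 1483, 1806, 3, 326)

print(uniprot_to_pdb(MACRO_5DUS, 1143))     # first residue of the range
print(uniprot_to_pdb(PROTEASE_4R3D, 1806))  # last residue of the range
```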