ID   POLG_WNV       STANDARD;      PRT;  3430 AA.
AC   P06935;
DT   01-JAN-1988 (Rel. 06, Created)
DT   01-JAN-1988 (Rel. 06, Last sequence update)
DT   28-FEB-2003 (Rel. 41, Last annotation update)
DE   Genome polyprotein [Contains: Capsid protein C (Core protein); Matrix
DE   protein (Envelope protein M); Major envelope protein E; Nonstructural
DE   proteins NS1, NS2A, NS2B, NS4A and NS4B; Protease/helicase
DE   (EC 3.4.21.98) (NS3); RNA-directed RNA polymerase (EC 2.7.7.48)
DE   (NS5)].
OS   West Nile virus (WN).
OC   Viruses; ssRNA positive-strand viruses, no DNA stage; Flaviviridae;
OC   Flavivirus.
OX   NCBI_TaxID=11082;
RN   [1]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=86124703; PubMed=3753811;
RA   Castle E., Leidner U., Nowak T., Wengler G., Wengler G.;
RT   "Primary structure of the West Nile flavivirus genome region coding
RT   for all nonstructural proteins.";
RL   Virology 149:10-26(1986).
RN   [2]
RP   SEQUENCE OF 1-291 FROM N.A.
RX   MEDLINE=85274372; PubMed=2992152;
RA   Castle E., Nowak T., Leidner U., Wengler G., Wengler G.;
RT   "Sequence analysis of the viral core protein and the
RT   membrane-associated proteins V1 and NV2 of the flavivirus West Nile
RT   virus and of the genome sequence for these proteins.";
RL   Virology 145:227-236(1985).
RN   [3]
RP   SEQUENCE OF 255-854 FROM N.A.
RX   MEDLINE=86072082; PubMed=3855247;
RA   Wengler G., Castle E., Leidner U., Nowak T., Wengler G.;
RT   "Sequence analysis of the membrane protein V3 of the flavivirus West
RT   Nile virus and of its gene.";
RL   Virology 147:264-274(1985).
RN   [4]
RP   DISULFIDE BONDS IN E PROTEIN.
RX   MEDLINE=87122143; PubMed=3811228;
RA   Nowak T., Wengler G.;
RT   "Analysis of disulfides present in the membrane proteins of the West
RT   Nile flavivirus.";
RL   Virology 156:127-137(1987).
CC   -!- FUNCTION: THE SMALL PROTEINS NS2A, NS2B, NS4A AND NS4B ARE
CC       HYDROPHOBIC, SUGGESTING A POSSIBLE MEMBRANE-RELATED FUNCTION.
CC       NS3 AND NS5 MAY PLAY A ROLE IN THE VIRAL RNA REPLICATION.
CC   -!- CATALYTIC ACTIVITY: Hydrolysis of four peptide bonds in the viral
CC       precursor polyprotein, commonly with Asp or Glu in the P6
CC       position, Cys or Thr in P1 and Ser or Ala in P1'.
CC   -!- CATALYTIC ACTIVITY: N nucleoside triphosphate = N diphosphate +
CC       {RNA}(N).
CC   -!- SUBUNIT: THE VIRION OF THIS VIRUS IS A NUCLEOCAPSID COVERED BY A
CC       LIPOPROTEIN ENVELOPE. THE ENVELOPE CONSISTS OF TWO PROTEINS:
CC       PROTEIN M AND GLYCOPROTEIN E. THE NUCLEOCAPSID IS A COMPLEX OF
CC       PROTEIN C AND MRNA.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; M12294; AAA48498.1; -.
DR   PIR; A25256; GNWVWV.
DR   HSSP; P14336; 1SVB.
DR   MEROPS; S07.001; -.
DR   InterPro; IPR001410; DEAD.
DR   InterPro; IPR001122; Flavi_capsidC.
DR   InterPro; IPR000336; Flavi_glycoprotE.
DR   InterPro; IPR001850; Flavi_helicase.
DR   InterPro; IPR000069; Flavi_M.
DR   InterPro; IPR001157; Flavi_NS1.
DR   InterPro; IPR000752; Flavi_NS2A.
DR   InterPro; IPR000487; Flavi_NS2B.
DR   InterPro; IPR000404; Flavi_NS4A.
DR   InterPro; IPR001528; Flavi_NS4B.
DR   InterPro; IPR000208; Flavi_NS5.
DR   InterPro; IPR002535; Flavi_propep.
DR   InterPro; IPR002877; FtsJ.
DR   InterPro; IPR001650; Helicase_C.
DR   InterPro; IPR007095; RNA_pol_DS_PS.
DR   InterPro; IPR007094; RNA_pol_PSvir.
DR   Pfam; PF01003; Flavi_capsid; 1.
DR   Pfam; PF02832; Flavi_glycop_C; 1.
DR   Pfam; PF00869; Flavi_glycoprot; 1.
DR   Pfam; PF00949; Flavi_helicase; 1.
DR   Pfam; PF01004; Flavi_M; 1.
DR   Pfam; PF00948; Flavi_NS1; 1.
DR   Pfam; PF01005; Flavi_NS2A; 1.
DR   Pfam; PF01002; Flavi_NS2B; 1.
DR   Pfam; PF01350; Flavi_NS4A; 1.
DR   Pfam; PF01349; Flavi_NS4B; 1.
DR   Pfam; PF00972; Flavi_NS5; 1.
DR   Pfam; PF01570; Flavi_propep; 1.
DR   Pfam; PF01728; FtsJ; 1.
DR   Pfam; PF00271; helicase_C; 1.
DR   ProDom; PD001556; Flavi_glycoprotE; 1.
DR   ProDom; PD001496; Flavi_NS1; 1.
DR   SMART; SM00487; DEXDc; 1.
DR   SMART; SM00490; HELICc; 1.
KW   Polyprotein; Glycoprotein; Transferase; RNA-directed RNA polymerase;
KW   Core protein; Coat protein; Envelope protein; Hydrolase; Helicase;
KW   ATP-binding; Transmembrane; Nonstructural protein.
FT   INIT_MET      1      1       REMOVED FROM CAPSID PROTEIN C BY THE
FT                                CELLULAR AMINOPEPTIDASE.
FT   CHAIN         1    123       CAPSID PROTEIN C.
FT   PROPEP      124    215
FT   CHAIN       216    290       ENVELOPE GLYCOPROTEIN M.
FT   CHAIN       291    787       MAJOR ENVELOPE PROTEIN E.
FT   CHAIN       788   1139       NONSTRUCTURAL PROTEIN NS1.
FT   CHAIN      1140   1370       NONSTRUCTURAL PROTEIN NS2A.
FT   CHAIN      1371   1501       NONSTRUCTURAL PROTEIN NS2B.
FT   CHAIN      1502   2120       PROTEASE/HELICASE (NS3).
FT   CHAIN      2121   2269       NONSTRUCTURAL PROTEIN NS4A.
FT   CHAIN      2270   2525       NONSTRUCTURAL PROTEIN NS4B.
FT   CHAIN      2526   3430       RNA-DIRECTED RNA POLYMERASE (NS5).
FT   DOMAIN      388    401       INVOLVED IN FUSION.
FT   NP_BIND    1695   1702       ATP (POTENTIAL).
FT   SITE       1786   1789       DEAH BOX.
FT   DISULFID    293    320
FT   DISULFID    350    406
FT   DISULFID    364    395
FT   DISULFID    382    411
FT   DISULFID    476    574
FT   DISULFID    591    622
FT   CARBOHYD    138    138       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    917    917       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    962    962       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    994    994       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1289   1289       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   2336   2336       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   2489   2489       N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 3430 AA; 379624 MW; 12EAA7E81F01CBEE CRC64; MSKKPGGPGK NRAVNMLKRG MPRGLSLIGL KRAMLSLIDG KGPIRFVLAL LAFFRFTAIA PTRAVLDRWR GVNKQTAMKH LLSFKKELGT LTSAINRRST KQKKRGGTAG FTILLGLIAC AGAVTLSNFQ GKVMMTVNAT DVTDVITIPT AAGKNLCIVR AMDVGYLCED TITYECPVLA AGNDPEDIDC WCTKSSVYVR YGRCTKTRHS RRSRRSLTVQ THGESTLANK KGAWLDSTKA TRYLVKTESW ILRNPGYALV AAVIGWMLGS NTMQRVVFAI LLLLVAPAYS FNCLGMSNRD FLEGVSGATW VDLVLEGDSC VTIMSKDKPT IDVKMMNMEA ANLADVRSYC YLASVSDLST RAACPTMGEA HNEKRADPAF VCKQGVVDRG WGNGCGLFGK GSIDTCAKFA CTTKATGWII QKENIKYEVA IFVHGPTTVE SHGKIGATQA GRFSITPSAP SYTLKLGEYG EVTVDCEPRS GIDTSAYYVM SVGEKSFLVH REWFMDLNLP WSSAGSTTWR NRETLMEFEE PHATKQSVVA LGSQEGALHQ ALAGAIPVEF SSNTVKLTSG HLKCRVKMEK LQLKGTTYGV CSKAFKFART PADTGHGTVV LELQYTGTDG PCKVPISSVA SLNDLTPVGR LVTVNPFVSV ATANSKVLIE LEPPFGDSYI VVGRGEQQIN HHWHKSGSSI GKAFTTTLRG AQRLAALGDT AWDFGSVGGV FTSVGKAIHQ VFGGAFRSLF GGMSWITQGL LGALLLWMGI NARDRSIAMT FLAVGGVLLF LSVNVHADTG CAIDIGRQEL RCGSGVFIHN DVEAWMDRYK FYPETPQGLA KIIQKAHAEG VCGLRSVSRL EHQMWEAIKD ELNTLLKENG VDLSVVVEKQ NGMYKAAPKR LAATTEKLEM GWKAWGKSII FAPELANNTF VIDGPETEEC PTANRAWNSM EVEDFGFGLT STRMFLRIRE TNTTECDSKI IGTAVKNNMA VHSDLSYWIE SGLNDTWKLE RAVLGEVKSC TWPETHTLWG DGVLESDLII PITLAGPRSN HNRRPGYKTQ NQGPWDEGRV EIDFDYCPGT TVTISDSCEH RGPAARTTTE SGKLITDWCC RSCTLPPLRF QTENGCWYGM EIRPTRHDEK TLVQSRVNAY NADMIDPFQL GLMVVFLATQ EVLRKRWTAK ISIPAIMLAL LVLVFGGITY TDVLRYVILV GAAFAEANSG GDVVHLALMA TFKIQPVFLV ASFLKARWTN QESILLMLAA AFFQMAYYDA KNVLSWEVPD VLNSLSVAWM ILRAISFTNT SNVVVPLLAL LTPGLKCLNL DVYRILLLMV GVGSLIKEKR SSAAKKKGAC LICLALASTG VFNPMILAAG LMACDPNRKR GWPATEVMTA VGLMFAIVGG LAELDIDSMA IPMTIAGLMF AAFVISGKST DMWIERTADI TWESDAEITG SSERVDVRLD DDGNFQLMND PGAPWKIWML RMACLAISAY TPWAILPSVI GFWITLQYTK RGGVLWDTPS PKEYKKGDTT TGVYRIMTRG LLGSYQAGAG VMVEGVFHTL WHTTKGAALM SGEGRLDPYW GSVKEDRLCY GGPWKLQHKW NGHDEVQMIV VEPGKNVKNV QTKPGVFKTP EGEIGAVTLD YPTGTSGSPI VDKNGDVIGL YGNGVIMPNG SYISAIVQGE RMEEPAPAGF EPEMLRKKQI TVLDLHPGAG KTRKILPQII KEAINKRLRT AVLAPTRVVA AEMSEALRGL PIRYQTSAVH REHSGNEIVD 
VMCHATLTHR LMSPHRVPNY NLFIMDEAHF TDPASIAARG YIATKVELGE AAAIFMTATP PGTSDPFPES NAPISDMQTE IPDRAWNTGY EWITEYVGKT VWFVPSVKMG NEIALCLQRA GKKVIQLNRK SYETEYPKCK NDDWDFVYTT DISEMGANFK ASRVIDSRKS VKPTIIEEGD GRVILGEPSA ITAASAAQRR GRIGRNPSQV GDEYCYGGHT NEDDSNFAHW TEARIMLDNI NMPNGLVAQL YQPEREKCTP RTGNTGSEGK NGRTSFEFLR TADLPVWLAY KVAAAGISYH DRKWCFDGPR TNTILEDNNE VEVITKLGER KILRPRWADA RVYSDHQALK SFKDFASGKR SQIGLVEVLG RMPEHFMVKT WEALDTMYVV ATAEKGGRAH RMALEELPDA LQTIVLIALL SVMSLGVFFL LMQRKGIGKI GLGGVILGAA TFFCWMAEVP GTKIAGMLLL SLLLMIVLIP ESEKQRSQTD NQLAVFLICV LTLVGAVAAN EMGWLDKTKN DIGSLLGHRP EARETTLGVE SFLLDLRPAT AWSLYAVTTA VLTPLLKHLI TSDYINTSLT SINVQASALF TLARGFPFVD VGVSALLLAV GCWGQVTLTV TVTAAALLFC HYAYMVPGWQ AEAMRSAQRR TAAGIMKNVV VDGIVATDVP ELERTTPVMQ KKVGQIILIL VSMAAVVVNP SVRTVREAGI LTTAAAVTLW ENGASSVWNA TTAIGLCHIM RGGWLSCLSI MWTLIKNMEK PGLKRGGAKG RTLGEVWKER LNHMTKEEFT RYRKEAITEV DRSAAKHARR EGNITGGHPV SRGTAKLRWL VERRFLEPVG KVVDLGCGRG GWCYYMATQK RVQEVKGYTK GGPGHEEPQL VQSYGWNIVT MKSGVDVFYR PSEASDTLLC DIGESSSSAE VEEHRTVRVL EMVEDWLHRG PKEFCIKVLC PYMPKVIEKM ETLQRRYGGG LIRNPLSRNS THEMYWVSHA SGNIVHSVNM TSQVLLGRME KKTWKGPQFE EDVNLGSGTR AVGKPLLNSD TSKIKNRIER LKKEYSSTWH QDANHPYRTW NYHGSYEVKP TGSASSLVNG VVRLLSKPMG TITNVTTMAM TDTTPFGQQR VFKEKVDTKA PEPPEGVKYV LNETTNWLWA FLARDKKPRM CSREEFIGKV NSNAALGAMF EEQNQWKNAR EAVEDPKFWE MVDEEREAHL RGECNTCIYN MMGKREKKPG EFGKAKGSRA IWFMWLGARF LEFEALGFLN EDHWLGRKNS GGGVEGLGLQ KLGYILKEVG TKPGGKVYAD DTAGWDTRIT KADLENEAKV LELLDGEHRR LARSIIELTY RHKVVKVMRP AADGKTVMDV ISREDQRGSG QVVTYALNTF TNLAVQLVRM MEGEGVIGPD DVEKLGKGKG PKVRTWLFEN GEERLSRMAV SGDDCVVKPL DDRFATSLHF LNAMSKVRKD IQEWKPSTGW YDWQQVPFCS NHFTELIMKD GRTLVVPCRG QDELIGRARI SPGAGWNVRD TACLAKSYAQ MWLLLYFHRR DLRLMANAIC SAVPANWVPT GRTTWSIHAK GEWMTTEDML AVWNRVWIEE NEWMEDKTPV ERWSDVPYSG KREDIWCGSL IGTRTRATWA ENIHVAINQV RSVIGEEKYV DYMSSLRRYE DTIVVEDTVL //