ID   POLG_WNV       STANDARD;      PRT;  3430 AA.
AC   P06935;
DT   01-JAN-1988 (Rel. 06, Created)
DT   15-MAR-2004 (Rel. 43, Last sequence update)
DT   15-MAR-2004 (Rel. 43, Last annotation update)
DE   Genome polyprotein [Contains: Capsid protein C (Core protein); Matrix
DE   protein (Envelope protein M); Major envelope protein E; Nonstructural
DE   proteins NS1, NS2A, NS4A and NS4B; Flavivirin (EC 3.4.21.91) (NS2B/NS3
DE   proteinase); RNA-directed RNA polymerase (EC 2.7.7.48) (NS5)].
OS   West Nile virus (WN).
OC   Viruses; ssRNA positive-strand viruses, no DNA stage; Flaviviridae;
OC   Flavivirus.
OX   NCBI_TaxID=11082;
RN   [1]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=86124703; PubMed=3753811;
RA   Castle E., Leidner U., Nowak T., Wengler G., Wengler G.;
RT   "Primary structure of the West Nile flavivirus genome region coding
RT   for all nonstructural proteins.";
RL   Virology 149:10-26(1986).
RN   [2]
RP   REVISIONS TO 1908; 2018-2036; 2242 AND 2859-2860.
RX   MEDLINE=21176376; PubMed=11277701;
RA   Yamshchikov V.F., Wengler G., Perelygin A.A., Brinton M.A.,
RA   Compans R.W.;
RT   "An infectious clone of West Nile flavivirus.";
RL   Virology 281:294-304(2001).
RN   [3]
RP   SEQUENCE OF 1-291 FROM N.A.
RX   MEDLINE=85274372; PubMed=2992152;
RA   Castle E., Nowak T., Leidner U., Wengler G., Wengler G.;
RT   "Sequence analysis of the viral core protein and the
RT   membrane-associated proteins V1 and NV2 of the flavivirus West Nile
RT   virus and of the genome sequence for these proteins.";
RL   Virology 145:227-236(1985).
RN   [4]
RP   SEQUENCE OF 255-854 FROM N.A.
RX   MEDLINE=86072082; PubMed=3855247;
RA   Wengler G., Castle E., Leidner U., Nowak T., Wengler G.;
RT   "Sequence analysis of the membrane protein V3 of the flavivirus West
RT   Nile virus and of its gene.";
RL   Virology 147:264-274(1985).
RN   [5]
RP   DISULFIDE BONDS IN E PROTEIN.
RX   MEDLINE=87122143; PubMed=3811228;
RA   Nowak T., Wengler G.;
RT   "Analysis of disulfides present in the membrane proteins of the West
RT   Nile flavivirus.";
RL   Virology 156:127-137(1987).
CC   -!- FUNCTION: The small proteins NS2A, NS4A and NS4B are hydrophobic,
CC       suggesting a possible membrane-related function. NS5 may play a
CC       role in the viral RNA replication. NS3 and NS2B form a protease
CC       which processes the viral polyprotein into separate proteins.
CC   -!- CATALYTIC ACTIVITY: Selective hydrolysis of Xaa-Xaa-|-Xbb bonds in
CC       which each of the Xaa can be either Arg or Lys and Xbb can be
CC       either Ser or Ala.
CC   -!- CATALYTIC ACTIVITY: N nucleoside triphosphate = N diphosphate +
CC       {RNA}(N).
CC   -!- SUBUNIT: The virion of this virus is a nucleocapsid covered by a
CC       lipoprotein envelope. The envelope consists of two proteins:
CC       protein M and glycoprotein E. The nucleocapsid is a complex of
CC       protein C and mRNA.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; M12294; AAA48498.2; -.
DR   PIR; A25256; GNWVWV.
DR   HSSP; P14336; 1SVB.
DR   MEROPS; S07.001; -.
DR   InterPro; IPR009003; Cys_Ser_trypsin.
DR   InterPro; IPR001410; DEAD.
DR   InterPro; IPR001122; Flavi_capsidC.
DR   InterPro; IPR000336; Flavi_glycoprotE.
DR   InterPro; IPR000069; Flavi_M.
DR   InterPro; IPR001157; Flavi_NS1.
DR   InterPro; IPR000752; Flavi_NS2A.
DR   InterPro; IPR000487; Flavi_NS2B.
DR   InterPro; IPR000404; Flavi_NS4A.
DR   InterPro; IPR001528; Flavi_NS4B.
DR   InterPro; IPR000208; Flavi_NS5.
DR   InterPro; IPR002535; Flavi_propep.
DR   InterPro; IPR001650; Helicase_C.
DR   InterPro; IPR007110; Ig-like.
DR   InterPro; IPR001850; Peptidase_S7.
DR   InterPro; IPR007095; RNA_pol_DS_PS.
DR   InterPro; IPR007094; RNA_pol_PSvir.
DR   InterPro; IPR002877; RrmJ_FtsJ.
DR   Pfam; PF01003; Flavi_capsid; 1.
DR   Pfam; PF02832; Flavi_glycop_C; 1.
DR   Pfam; PF00869; Flavi_glycoprot; 1.
DR   Pfam; PF00949; Flavi_helicase; 1.
DR   Pfam; PF01004; Flavi_M; 1.
DR   Pfam; PF00948; Flavi_NS1; 1.
DR   Pfam; PF01005; Flavi_NS2A; 1.
DR   Pfam; PF01002; Flavi_NS2B; 1.
DR   Pfam; PF01350; Flavi_NS4A; 1.
DR   Pfam; PF01349; Flavi_NS4B; 1.
DR   Pfam; PF00972; Flavi_NS5; 1.
DR   Pfam; PF01570; Flavi_propep; 1.
DR   Pfam; PF01728; FtsJ; 1.
DR   Pfam; PF00271; helicase_C; 1.
DR   ProDom; PD001556; Flavi_glycoprotE; 1.
DR   ProDom; PD001496; Flavi_NS1; 1.
DR   SMART; SM00487; DEXDc; 1.
DR   SMART; SM00490; HELICc; 1.
DR   PROSITE; PS00690; DEAH_ATP_HELICASE; FALSE_NEG.
KW   Polyprotein; Glycoprotein; Transferase; RNA-directed RNA polymerase;
KW   Core protein; Coat protein; Envelope protein; Hydrolase; Helicase;
KW   ATP-binding; Transmembrane; Nonstructural protein.
FT   INIT_MET      1      1       Removed from capsid protein C by the
FT                                cellular aminopeptidase.
FT   CHAIN         1    123       CAPSID PROTEIN C.
FT   PROPEP      124    215
FT   CHAIN       216    290       ENVELOPE GLYCOPROTEIN M.
FT   CHAIN       291    787       MAJOR ENVELOPE PROTEIN E.
FT   CHAIN       788   1139       NONSTRUCTURAL PROTEIN NS1.
FT   CHAIN      1140   1370       NONSTRUCTURAL PROTEIN NS2A.
FT   CHAIN      1371   1501       FLAVIVIRIN PROTEASE SUBUNIT NS2B.
FT   CHAIN      1502   2120       FLAVIVIRIN PROTEASE SUBUNIT NS3.
FT   CHAIN      2121   2269       NONSTRUCTURAL PROTEIN NS4A.
FT   CHAIN      2270   2525       NONSTRUCTURAL PROTEIN NS4B.
FT   CHAIN      2526   3430       RNA-DIRECTED RNA POLYMERASE (NS5).
FT   DOMAIN      388    401       INVOLVED IN FUSION.
FT   NP_BIND    1695   1702       ATP (Potential).
FT   ACT_SITE   1552   1552       CHARGE RELAY SYSTEM (BY SIMILARITY).
FT   ACT_SITE   1576   1576       CHARGE RELAY SYSTEM (BY SIMILARITY).
FT   ACT_SITE   1636   1636       CHARGE RELAY SYSTEM (BY SIMILARITY).
FT   SITE       1786   1789       DEAH BOX.
FT   DISULFID    293    320
FT   DISULFID    350    406
FT   DISULFID    364    395
FT   DISULFID    382    411
FT   DISULFID    476    574
FT   DISULFID    591    622
FT   CARBOHYD    138    138       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    917    917       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    962    962       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    994    994       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1289   1289       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   2336   2336       N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 2489 2489 N-LINKED (GLCNAC...) (POTENTIAL). SQ SEQUENCE 3430 AA; 380104 MW; 42D71B7CB12DC45B CRC64; MSKKPGGPGK NRAVNMLKRG MPRGLSLIGL KRAMLSLIDG KGPIRFVLAL LAFFRFTAIA PTRAVLDRWR GVNKQTAMKH LLSFKKELGT LTSAINRRST KQKKRGGTAG FTILLGLIAC AGAVTLSNFQ GKVMMTVNAT DVTDVITIPT AAGKNLCIVR AMDVGYLCED TITYECPVLA AGNDPEDIDC WCTKSSVYVR YGRCTKTRHS RRSRRSLTVQ THGESTLANK KGAWLDSTKA TRYLVKTESW ILRNPGYALV AAVIGWMLGS NTMQRVVFAI LLLLVAPAYS FNCLGMSNRD FLEGVSGATW VDLVLEGDSC VTIMSKDKPT IDVKMMNMEA ANLADVRSYC YLASVSDLST RAACPTMGEA HNEKRADPAF VCKQGVVDRG WGNGCGLFGK GSIDTCAKFA CTTKATGWII QKENIKYEVA IFVHGPTTVE SHGKIGATQA GRFSITPSAP SYTLKLGEYG EVTVDCEPRS GIDTSAYYVM SVGEKSFLVH REWFMDLNLP WSSAGSTTWR NRETLMEFEE PHATKQSVVA LGSQEGALHQ ALAGAIPVEF SSNTVKLTSG HLKCRVKMEK LQLKGTTYGV CSKAFKFART PADTGHGTVV LELQYTGTDG PCKVPISSVA SLNDLTPVGR LVTVNPFVSV ATANSKVLIE LEPPFGDSYI VVGRGEQQIN HHWHKSGSSI GKAFTTTLRG AQRLAALGDT AWDFGSVGGV FTSVGKAIHQ VFGGAFRSLF GGMSWITQGL LGALLLWMGI NARDRSIAMT FLAVGGVLLF LSVNVHADTG CAIDIGRQEL RCGSGVFIHN DVEAWMDRYK FYPETPQGLA KIIQKAHAEG VCGLRSVSRL EHQMWEAIKD ELNTLLKENG VDLSVVVEKQ NGMYKAAPKR LAATTEKLEM GWKAWGKSII FAPELANNTF VIDGPETEEC PTANRAWNSM EVEDFGFGLT STRMFLRIRE TNTTECDSKI IGTAVKNNMA VHSDLSYWIE SGLNDTWKLE RAVLGEVKSC TWPETHTLWG DGVLESDLII PITLAGPRSN HNRRPGYKTQ NQGPWDEGRV EIDFDYCPGT TVTISDSCEH RGPAARTTTE SGKLITDWCC RSCTLPPLRF QTENGCWYGM EIRPTRHDEK TLVQSRVNAY NADMIDPFQL GLMVVFLATQ EVLRKRWTAK ISIPAIMLAL LVLVFGGITY TDVLRYVILV GAAFAEANSG GDVVHLALMA TFKIQPVFLV ASFLKARWTN QESILLMLAA AFFQMAYYDA KNVLSWEVPD VLNSLSVAWM ILRAISFTNT SNVVVPLLAL LTPGLKCLNL DVYRILLLMV GVGSLIKEKR SSAAKKKGAC LICLALASTG VFNPMILAAG LMACDPNRKR GWPATEVMTA VGLMFAIVGG LAELDIDSMA IPMTIAGLMF AAFVISGKST DMWIERTADI TWESDAEITG SSERVDVRLD DDGNFQLMND PGAPWKIWML RMACLAISAY TPWAILPSVI GFWITLQYTK RGGVLWDTPS PKEYKKGDTT TGVYRIMTRG LLGSYQAGAG VMVEGVFHTL WHTTKGAALM SGEGRLDPYW GSVKEDRLCY GGPWKLQHKW NGHDEVQMIV VEPGKNVKNV QTKPGVFKTP EGEIGAVTLD YPTGTSGSPI VDKNGDVIGL YGNGVIMPNG SYISAIVQGE RMEEPAPAGF EPEMLRKKQI TVLDLHPGAG KTRKILPQII 
KEAINKRLRT AVLAPTRVVA AEMSEALRGL PIRYQTSAVH REHSGNEIVD VMCHATLTHR LMSPHRVPNY NLFIMDEAHF TDPASIAARG YIATKVELGE AAAIFMTATP PGTSDPFPES NAPISDMQTE IPDRAWNTGY EWITEYVGKT VWFVPSVKMG NEIALCLQRA GKKVIQLNRK SYETEYPKCK NDDWDFVITT DISEMGANFK ASRVIDSRKS VKPTIIEEGD GRVILGEPSA ITAASAAQRR GRIGRNPSQV GDEYCYGGHT NEDDSNFAHW TEARIMLDNI NMPNGLVAQL YQPEREKVYT MDGEYRLRGE ERKNFLEFLR TADLPVWLAY KVAAAGISYH DRKWCFDGPR TNTILEDNNE VEVITKLGER KILRPRWADA RVYSDHQALK SFKDFASGKR SQIGLVEVLG RMPEHFMVKT WEALDTMYVV ATAEKGGRAH RMALEELPDA LQTIVLIALL SVMSLGVFFL LMQRKGIGKI GLGGVILGAA TFFCWMAEVP GTKIAGMLLL SLLLMIVLIP EPEKQRSQTD NQLAVFLICV LTLVGAVAAN EMGWLDKTKN DIGSLLGHRP EARETTLGVE SFLLDLRPAT AWSLYAVTTA VLTPLLKHLI TSDYINTSLT SINVQASALF TLARGFPFVD VGVSALLLAV GCWGQVTLTV TVTAAALLFC HYAYMVPGWQ AEAMRSAQRR TAAGIMKNVV VDGIVATDVP ELERTTPVMQ KKVGQIILIL VSMAAVVVNP SVRTVREAGI LTTAAAVTLW ENGASSVWNA TTAIGLCHIM RGGWLSCLSI MWTLIKNMEK PGLKRGGAKG RTLGEVWKER LNHMTKEEFT RYRKEAITEV DRSAAKHARR EGNITGGHPV SRGTAKLRWL VERRFLEPVG KVVDLGCGRG GWCYYMATQK RVQEVKGYTK GGPGHEEPQL VQSYGWNIVT MKSGVDVFYR PSEASDTLLC DIGESSSSAE VEEHRTVRVL EMVEDWLHRG PKEFCIKVLC PYMPKVIEKM ETLQRRYGGG LIRNPLSRNS THEMYWVSHA SGNIVHSVNM TSQVLLGRME KKTWKGPQFE EDVNLGSGTR AVGKPLLNSD TSKIKNRIER LKKEYSSTWH QDANHPYRTW NYHGSYEVKP TGSASSLVNG VVRLLSKPWD TITNVTTMAM TDTTPFGQQR VFKEKVDTKA PEPPEGVKYV LNETTNWLWA FLARDKKPRM CSREEFIGKV NSNAALGAMF EEQNQWKNAR EAVEDPKFWE MVDEEREAHL RGECNTCIYN MMGKREKKPG EFGKAKGSRA IWFMWLGARF LEFEALGFLN EDHWLGRKNS GGGVEGLGLQ KLGYILKEVG TKPGGKVYAD DTAGWDTRIT KADLENEAKV LELLDGEHRR LARSIIELTY RHKVVKVMRP AADGKTVMDV ISREDQRGSG QVVTYALNTF TNLAVQLVRM MEGEGVIGPD DVEKLGKGKG PKVRTWLFEN GEERLSRMAV SGDDCVVKPL DDRFATSLHF LNAMSKVRKD IQEWKPSTGW YDWQQVPFCS NHFTELIMKD GRTLVVPCRG QDELIGRARI SPGAGWNVRD TACLAKSYAQ MWLLLYFHRR DLRLMANAIC SAVPANWVPT GRTTWSIHAK GEWMTTEDML AVWNRVWIEE NEWMEDKTPV ERWSDVPYSG KREDIWCGSL IGTRTRATWA ENIHVAINQV RSVIGEEKYV DYMSSLRRYE DTIVVEDTVL //