ID   POLG_WNV                Reviewed;        3430 AA.
AC   P06935;
DT   01-JAN-1988, integrated into UniProtKB/Swiss-Prot.
DT   24-OCT-2003, sequence version 2.
DT   15-JAN-2008, entry version 87.
DE   Genome polyprotein [Contains: Capsid protein C (Core protein);
DE   Envelope protein M (Matrix protein); Major envelope protein E;
DE   Non-structural protein 1 (NS1); Non-structural protein 2A (NS2A);
DE   Flavivirin protease NS2B regulatory subunit; Flavivirin protease NS3
DE   catalytic subunit (EC 3.4.21.91); Non-structural protein 4A (NS4A);
DE   Non-structural protein 4B (NS4B); RNA-directed RNA polymerase
DE   (EC 2.7.7.48) (NS5)].
OS   West Nile virus (WNV).
OC   Viruses; ssRNA positive-strand viruses, no DNA stage; Flaviviridae;
OC   Flavivirus; Japanese encephalitis virus group.
OX   NCBI_TaxID=11082;
OH   NCBI_TaxID=7158; Aedes.
OH   NCBI_TaxID=34610; Amblyomma variegatum.
OH   NCBI_TaxID=8782; Aves.
OH   NCBI_TaxID=53527; Culex.
OH   NCBI_TaxID=9606; Homo sapiens (Human).
OH   NCBI_TaxID=34627; Hyalomma marginatum.
OH   NCBI_TaxID=308735; Mansonia uniformis.
OH   NCBI_TaxID=308737; Mimomyia.
OH   NCBI_TaxID=34630; Rhipicephalus.
RN   [1]
RP   NUCLEOTIDE SEQUENCE [GENOMIC RNA].
RX   MEDLINE=86124703; PubMed=3753811; DOI=10.1016/0042-6822(86)90082-6;
RA   Castle E., Leidner U., Nowak T., Wengler G., Wengler G.;
RT   "Primary structure of the West Nile flavivirus genome region coding
RT   for all nonstructural proteins.";
RL   Virology 149:10-26(1986).
RN   [2]
RP   SEQUENCE REVISION TO 1908; 2018-2036; 2242 AND 2859-2860.
RX   MEDLINE=21176376; PubMed=11277701; DOI=10.1006/viro.2000.0795;
RA   Yamshchikov V.F., Wengler G., Perelygin A.A., Brinton M.A.,
RA   Compans R.W.;
RT   "An infectious clone of the West Nile flavivirus.";
RL   Virology 281:294-304(2001).
RN   [3]
RP   NUCLEOTIDE SEQUENCE [GENOMIC RNA] OF 1-291.
RX   MEDLINE=85274372; PubMed=2992152; DOI=10.1016/0042-6822(85)90156-4;
RA   Castle E., Nowak T., Leidner U., Wengler G., Wengler G.;
RT   "Sequence analysis of the viral core protein and the
RT   membrane-associated proteins V1 and NV2 of the flavivirus West Nile
RT   virus and of the genome sequence for these proteins.";
RL   Virology 145:227-236(1985).
RN   [4]
RP   NUCLEOTIDE SEQUENCE [GENOMIC RNA] OF 255-854.
RX   MEDLINE=86072082; PubMed=3855247; DOI=10.1016/0042-6822(85)90129-1;
RA   Wengler G., Castle E., Leidner U., Nowak T., Wengler G.;
RT   "Sequence analysis of the membrane protein V3 of the flavivirus West
RT   Nile virus and of its gene.";
RL   Virology 147:264-274(1985).
RN   [5]
RP   DISULFIDE BONDS IN E PROTEIN.
RX   MEDLINE=87122143; PubMed=3811228; DOI=10.1016/0042-6822(87)90443-0;
RA   Nowak T., Wengler G.;
RT   "Analysis of disulfides present in the membrane proteins of the West
RT   Nile flavivirus.";
RL   Virology 156:127-137(1987).
CC   -!- FUNCTION: The small proteins NS2A, NS4A and NS4B are hydrophobic,
CC       suggesting a possible membrane-related function. NS5 may play a
CC       role in the viral RNA replication. The NS2B/NS3 protease complex
CC       processes the viral polyprotein.
CC   -!- CATALYTIC ACTIVITY: Selective hydrolysis of -Xaa-Xaa-|-Yaa- bonds
CC       in which each of the Xaa can be either Arg or Lys and Yaa can be
CC       either Ser or Ala.
CC   -!- CATALYTIC ACTIVITY: Nucleoside triphosphate + RNA(n) = diphosphate
CC       + RNA(n+1).
CC   -!- SUBUNIT: NS3 and NS2B form a heterodimer. NS3 is the catalytic
CC       subunit, whereas NS2B strongly stimulates the latter (By
CC       similarity).
CC   -!- INTERACTION:
CC       P05106:ITGB3 (xeno); NbExp=2; IntAct=EBI-981051, EBI-702847;
CC   -!- PTM: Specific enzymatic cleavages in vivo yield mature proteins
CC       (By similarity).
CC   -!- MISCELLANEOUS: The virion of this virus is a nucleocapsid covered
CC       by a lipoprotein envelope. The envelope contains two proteins: the
CC       protein M and glycoprotein E. The nucleocapsid is a complex of
CC       protein C and mRNA.
CC       In immature particles, there are 60
CC       icosahedrally organized trimeric spikes on the surface. Each spike
CC       consists of three heterodimers of envelope protein M precursor
CC       (prM) and envelope protein E (By similarity).
CC   -!- SIMILARITY: Contains 1 helicase ATP-binding domain.
CC   -!- SIMILARITY: Contains 1 helicase C-terminal domain.
CC   -!- SIMILARITY: Contains 1 peptidase S7 domain.
CC   -!- SIMILARITY: Contains 1 RdRp catalytic domain.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; M12294; AAA48498.2; -; Genomic_RNA.
DR   PIR; A25256; GNWVWV.
DR   RefSeq; NP_041724.2; -.
DR   PDB; 2FP7; X-ray; 1.68 A; A=1420-1466, B=1517-1688.
DR   PDB; 2G05; Model; -; D=1675-2120.
DR   PDBsum; 2FP7; -.
DR   PDBsum; 2G05; -.
DR   SMR; P06935; 25-97, 291-686, 1511-1679, 2531-2792.
DR   IntAct; P06935; -.
DR   GeneID; 912267; -.
DR   GO; GO:0005515; F:protein binding; IPI:IntAct.
DR   InterPro; IPR014001; DEAD-like_N.
DR   InterPro; IPR011492; DEAD_Flavivir.
DR   InterPro; IPR002464; DNA/RNA_helicase_ATP-dep_DEAH.
DR   InterPro; IPR001650; DNA/RNA_helicase_C.
DR   InterPro; IPR013756; Flav_glyE_cen_2.
DR   InterPro; IPR011999; Flav_glyE_cen_dm.
DR   InterPro; IPR013754; Flav_glyE_dim.
DR   InterPro; IPR001122; Flavi_capsidC.
DR   InterPro; IPR000069; Flavi_M.
DR   InterPro; IPR001157; Flavi_NS1.
DR   InterPro; IPR000752; Flavi_NS2A.
DR   InterPro; IPR000487; Flavi_NS2B.
DR   InterPro; IPR000404; Flavi_NS4A.
DR   InterPro; IPR001528; Flavi_NS4B.
DR   InterPro; IPR002535; Flavi_propep.
DR   InterPro; IPR000336; Flv_glyE_Ig-like.
DR   InterPro; IPR014412; Gen_Poly_FLV.
DR   InterPro; IPR014021; Helicase_SF1/SF2_ATP-bd.
DR   InterPro; IPR001850; Peptidase_S7.
DR   InterPro; IPR000208; RNA_pol_flaviviral.
DR   InterPro; IPR007094; RNA_pol_PSvir.
DR   InterPro; IPR002877; RrmJFtsJ_mtfrase.
DR   Gene3D; G3DSA:3.30.67.10; Flav_glyE_cen_2; 1.
DR   Gene3D; G3DSA:2.60.98.10; Flav_glyE_dim; 1.
DR   Gene3D; G3DSA:2.60.40.350; Flv_glyE_Ig-like; 1.
DR   Pfam; PF01003; Flavi_capsid; 1.
DR   Pfam; PF07652; Flavi_DEAD; 1.
DR   Pfam; PF02832; Flavi_glycop_C; 1.
DR   Pfam; PF00869; Flavi_glycoprot; 1.
DR   Pfam; PF01004; Flavi_M; 1.
DR   Pfam; PF00948; Flavi_NS1; 1.
DR   Pfam; PF01005; Flavi_NS2A; 1.
DR   Pfam; PF01002; Flavi_NS2B; 1.
DR   Pfam; PF01350; Flavi_NS4A; 1.
DR   Pfam; PF01349; Flavi_NS4B; 1.
DR   Pfam; PF00972; Flavi_NS5; 1.
DR   Pfam; PF01570; Flavi_propep; 1.
DR   Pfam; PF01728; FtsJ; 1.
DR   Pfam; PF00271; Helicase_C; 1.
DR   Pfam; PF00949; Peptidase_S7; 1.
DR   PIRSF; PIRSF003817; Gen_Poly_FLV; 1.
DR   ProDom; PD001496; Flavi_NS1; 1.
DR   SMART; SM00487; DEXDc; 1.
DR   SMART; SM00490; HELICc; 1.
DR   PROSITE; PS00690; DEAH_ATP_HELICASE; FALSE_NEG.
DR   PROSITE; PS51192; HELICASE_ATP_BIND_1; 1.
DR   PROSITE; PS51194; HELICASE_CTER; 1.
DR   PROSITE; PS50507; RDRP_SSRNA_POS; 1.
PE   1: Evidence at protein level;
KW   3D-structure; ATP-binding; Capsid protein;
KW   Cleavage on pair of basic residues; Complete proteome; Core protein;
KW   Envelope protein; Glycoprotein; Helicase; Hydrolase; Membrane;
KW   Nucleotide-binding; Nucleotidyltransferase; RNA replication;
KW   RNA-directed RNA polymerase; Transferase; Transmembrane; Virion.
FT   INIT_MET      1      1       Removed; by host.
FT   CHAIN         2    123       Capsid protein C.
FT                                /FTId=PRO_0000037743.
FT   PROPEP      124    215
FT                                /FTId=PRO_0000037744.
FT   CHAIN       216    290       Envelope protein M.
FT                                /FTId=PRO_0000037745.
FT   CHAIN       291    787       Major envelope protein E.
FT                                /FTId=PRO_0000037746.
FT   CHAIN       788   1139       Non-structural protein 1.
FT                                /FTId=PRO_0000037747.
FT   CHAIN      1140   1370       Non-structural protein 2A.
FT                                /FTId=PRO_0000037748.
FT   CHAIN      1371   1501       Flavivirin protease NS2B regulatory
FT                                subunit.
FT                                /FTId=PRO_0000037749.
FT   CHAIN      1502   2120       Flavivirin protease NS3 catalytic
FT                                subunit.
FT                                /FTId=PRO_0000037750.
FT   CHAIN      2121   2269       Non-structural protein 4A.
FT                                /FTId=PRO_0000037751.
FT   CHAIN      2270   2525       Non-structural protein 4B.
FT                                /FTId=PRO_0000037752.
FT   CHAIN      2526   3430       RNA-directed RNA polymerase.
FT                                /FTId=PRO_0000037753.
FT   DOMAIN     1508   1679       Peptidase S7.
FT   DOMAIN     1682   1838       Helicase ATP-binding.
FT   DOMAIN     1849   2014       Helicase C-terminal.
FT   DOMAIN     3055   3207       RdRp catalytic.
FT   NP_BIND    1695   1702       ATP (Potential).
FT   REGION      388    401       Involved in fusion.
FT   MOTIF      1786   1789       DEAH box.
FT   ACT_SITE   1552   1552       Charge relay system (By similarity).
FT   ACT_SITE   1576   1576       Charge relay system (By similarity).
FT   ACT_SITE   1636   1636       Charge relay system (By similarity).
FT   CARBOHYD    138    138       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    917    917       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    962    962       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    994    994       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   1289   1289       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2336   2336       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2489   2489       N-linked (GlcNAc...) (Potential).
FT   DISULFID    293    320
FT   DISULFID    350    406
FT   DISULFID    364    395
FT   DISULFID    382    411
FT   DISULFID    476    574
FT   DISULFID    591    622
FT   STRAND     1423   1428
FT   STRAND     1444   1449
FT   STRAND     1455   1457
FT   STRAND     1522   1527
FT   STRAND     1536   1543
FT   STRAND     1546   1550
FT   HELIX      1551   1554
FT   STRAND     1559   1561
FT   STRAND     1564   1566
FT   STRAND     1568   1572
FT   TURN       1573   1576
FT   STRAND     1577   1583
FT   STRAND     1592   1594
FT   STRAND     1596   1600
FT   STRAND     1608   1612
FT   STRAND     1615   1619
FT   STRAND     1622   1627
FT   HELIX      1633   1635
FT   STRAND     1639   1641
FT   STRAND     1647   1651
FT   STRAND     1654   1656
FT   STRAND     1662   1665
FT   STRAND     1690   1694
FT   TURN       1701   1704
FT   HELIX      1705   1715
FT   STRAND     1720   1726
FT   HELIX      1727   1736
FT   STRAND     1741   1744
FT   STRAND     1759   1763
FT   HELIX      1764   1772
FT   STRAND     1773   1775
FT   STRAND     1781   1786
FT   TURN       1787   1789
FT   HELIX      1793   1808
FT   STRAND     1812   1819
FT   STRAND     1835   1838
FT   HELIX      1851   1855
FT   STRAND     1860   1863
FT   HELIX      1867   1878
FT   TURN       1879   1881
FT   STRAND     1884   1887
FT   TURN       1889   1891
FT   HELIX      1892   1900
FT   STRAND     1905   1909
FT   HELIX      1911   1913
FT   STRAND
1921 1926 FT STRAND 1929 1936 FT STRAND 1938 1940 FT STRAND 1942 1950 FT HELIX 1953 1960 FT STRAND 1973 1976 FT HELIX 1989 1997 FT HELIX 2012 2014 FT TURN 2015 2017 FT HELIX 2029 2041 FT HELIX 2046 2054 FT HELIX 2063 2065 FT HELIX 2070 2072 FT STRAND 2082 2084 FT STRAND 2090 2092 FT STRAND 2096 2099 FT HELIX 2100 2102 FT HELIX 2106 2116 SQ SEQUENCE 3430 AA; 380110 MW; 42D71B7CB12DC45B CRC64; MSKKPGGPGK NRAVNMLKRG MPRGLSLIGL KRAMLSLIDG KGPIRFVLAL LAFFRFTAIA PTRAVLDRWR GVNKQTAMKH LLSFKKELGT LTSAINRRST KQKKRGGTAG FTILLGLIAC AGAVTLSNFQ GKVMMTVNAT DVTDVITIPT AAGKNLCIVR AMDVGYLCED TITYECPVLA AGNDPEDIDC WCTKSSVYVR YGRCTKTRHS RRSRRSLTVQ THGESTLANK KGAWLDSTKA TRYLVKTESW ILRNPGYALV AAVIGWMLGS NTMQRVVFAI LLLLVAPAYS FNCLGMSNRD FLEGVSGATW VDLVLEGDSC VTIMSKDKPT IDVKMMNMEA ANLADVRSYC YLASVSDLST RAACPTMGEA HNEKRADPAF VCKQGVVDRG WGNGCGLFGK GSIDTCAKFA CTTKATGWII QKENIKYEVA IFVHGPTTVE SHGKIGATQA GRFSITPSAP SYTLKLGEYG EVTVDCEPRS GIDTSAYYVM SVGEKSFLVH REWFMDLNLP WSSAGSTTWR NRETLMEFEE PHATKQSVVA LGSQEGALHQ ALAGAIPVEF SSNTVKLTSG HLKCRVKMEK LQLKGTTYGV CSKAFKFART PADTGHGTVV LELQYTGTDG PCKVPISSVA SLNDLTPVGR LVTVNPFVSV ATANSKVLIE LEPPFGDSYI VVGRGEQQIN HHWHKSGSSI GKAFTTTLRG AQRLAALGDT AWDFGSVGGV FTSVGKAIHQ VFGGAFRSLF GGMSWITQGL LGALLLWMGI NARDRSIAMT FLAVGGVLLF LSVNVHADTG CAIDIGRQEL RCGSGVFIHN DVEAWMDRYK FYPETPQGLA KIIQKAHAEG VCGLRSVSRL EHQMWEAIKD ELNTLLKENG VDLSVVVEKQ NGMYKAAPKR LAATTEKLEM GWKAWGKSII FAPELANNTF VIDGPETEEC PTANRAWNSM EVEDFGFGLT STRMFLRIRE TNTTECDSKI IGTAVKNNMA VHSDLSYWIE SGLNDTWKLE RAVLGEVKSC TWPETHTLWG DGVLESDLII PITLAGPRSN HNRRPGYKTQ NQGPWDEGRV EIDFDYCPGT TVTISDSCEH RGPAARTTTE SGKLITDWCC RSCTLPPLRF QTENGCWYGM EIRPTRHDEK TLVQSRVNAY NADMIDPFQL GLMVVFLATQ EVLRKRWTAK ISIPAIMLAL LVLVFGGITY TDVLRYVILV GAAFAEANSG GDVVHLALMA TFKIQPVFLV ASFLKARWTN QESILLMLAA AFFQMAYYDA KNVLSWEVPD VLNSLSVAWM ILRAISFTNT SNVVVPLLAL LTPGLKCLNL DVYRILLLMV GVGSLIKEKR SSAAKKKGAC LICLALASTG VFNPMILAAG LMACDPNRKR GWPATEVMTA VGLMFAIVGG LAELDIDSMA IPMTIAGLMF AAFVISGKST DMWIERTADI TWESDAEITG SSERVDVRLD 
DDGNFQLMND PGAPWKIWML RMACLAISAY TPWAILPSVI GFWITLQYTK RGGVLWDTPS PKEYKKGDTT TGVYRIMTRG LLGSYQAGAG VMVEGVFHTL WHTTKGAALM SGEGRLDPYW GSVKEDRLCY GGPWKLQHKW NGHDEVQMIV VEPGKNVKNV QTKPGVFKTP EGEIGAVTLD YPTGTSGSPI VDKNGDVIGL YGNGVIMPNG SYISAIVQGE RMEEPAPAGF EPEMLRKKQI TVLDLHPGAG KTRKILPQII KEAINKRLRT AVLAPTRVVA AEMSEALRGL PIRYQTSAVH REHSGNEIVD VMCHATLTHR LMSPHRVPNY NLFIMDEAHF TDPASIAARG YIATKVELGE AAAIFMTATP PGTSDPFPES NAPISDMQTE IPDRAWNTGY EWITEYVGKT VWFVPSVKMG NEIALCLQRA GKKVIQLNRK SYETEYPKCK NDDWDFVITT DISEMGANFK ASRVIDSRKS VKPTIIEEGD GRVILGEPSA ITAASAAQRR GRIGRNPSQV GDEYCYGGHT NEDDSNFAHW TEARIMLDNI NMPNGLVAQL YQPEREKVYT MDGEYRLRGE ERKNFLEFLR TADLPVWLAY KVAAAGISYH DRKWCFDGPR TNTILEDNNE VEVITKLGER KILRPRWADA RVYSDHQALK SFKDFASGKR SQIGLVEVLG RMPEHFMVKT WEALDTMYVV ATAEKGGRAH RMALEELPDA LQTIVLIALL SVMSLGVFFL LMQRKGIGKI GLGGVILGAA TFFCWMAEVP GTKIAGMLLL SLLLMIVLIP EPEKQRSQTD NQLAVFLICV LTLVGAVAAN EMGWLDKTKN DIGSLLGHRP EARETTLGVE SFLLDLRPAT AWSLYAVTTA VLTPLLKHLI TSDYINTSLT SINVQASALF TLARGFPFVD VGVSALLLAV GCWGQVTLTV TVTAAALLFC HYAYMVPGWQ AEAMRSAQRR TAAGIMKNVV VDGIVATDVP ELERTTPVMQ KKVGQIILIL VSMAAVVVNP SVRTVREAGI LTTAAAVTLW ENGASSVWNA TTAIGLCHIM RGGWLSCLSI MWTLIKNMEK PGLKRGGAKG RTLGEVWKER LNHMTKEEFT RYRKEAITEV DRSAAKHARR EGNITGGHPV SRGTAKLRWL VERRFLEPVG KVVDLGCGRG GWCYYMATQK RVQEVKGYTK GGPGHEEPQL VQSYGWNIVT MKSGVDVFYR PSEASDTLLC DIGESSSSAE VEEHRTVRVL EMVEDWLHRG PKEFCIKVLC PYMPKVIEKM ETLQRRYGGG LIRNPLSRNS THEMYWVSHA SGNIVHSVNM TSQVLLGRME KKTWKGPQFE EDVNLGSGTR AVGKPLLNSD TSKIKNRIER LKKEYSSTWH QDANHPYRTW NYHGSYEVKP TGSASSLVNG VVRLLSKPWD TITNVTTMAM TDTTPFGQQR VFKEKVDTKA PEPPEGVKYV LNETTNWLWA FLARDKKPRM CSREEFIGKV NSNAALGAMF EEQNQWKNAR EAVEDPKFWE MVDEEREAHL RGECNTCIYN MMGKREKKPG EFGKAKGSRA IWFMWLGARF LEFEALGFLN EDHWLGRKNS GGGVEGLGLQ KLGYILKEVG TKPGGKVYAD DTAGWDTRIT KADLENEAKV LELLDGEHRR LARSIIELTY RHKVVKVMRP AADGKTVMDV ISREDQRGSG QVVTYALNTF TNLAVQLVRM MEGEGVIGPD DVEKLGKGKG PKVRTWLFEN GEERLSRMAV SGDDCVVKPL DDRFATSLHF LNAMSKVRKD IQEWKPSTGW YDWQQVPFCS NHFTELIMKD GRTLVVPCRG 
QDELIGRARI SPGAGWNVRD TACLAKSYAQ MWLLLYFHRR DLRLMANAIC SAVPANWVPT GRTTWSIHAK GEWMTTEDML AVWNRVWIEE NEWMEDKTPV ERWSDVPYSG KREDIWCGSL IGTRTRATWA ENIHVAINQV RSVIGEEKYV DYMSSLRRYE DTIVVEDTVL //