ID POLG_WNV Reviewed; 3430 AA.
AC P06935;
DT 01-JAN-1988, integrated into UniProtKB/Swiss-Prot.
DT 24-OCT-2003, sequence version 2.
DT 25-NOV-2008, entry version 95.
DE RecName: Full=Genome polyprotein;
DE Contains:
DE RecName: Full=Protein C;
DE AltName: Full=Core protein;
DE AltName: Full=Capsid protein;
DE Contains:
DE RecName: Full=Small envelope protein M;
DE AltName: Full=Matrix protein;
DE Contains:
DE RecName: Full=Envelope protein E;
DE Contains:
DE RecName: Full=Non-structural protein 1;
DE Short=NS1;
DE Contains:
DE RecName: Full=Non-structural protein 2A;
DE Short=NS2A;
DE Contains:
DE RecName: Full=Flavivirin protease NS2B regulatory subunit;
DE Contains:
DE RecName: Full=Flavivirin protease NS3 catalytic subunit;
DE EC=3.4.21.91;
DE Contains:
DE RecName: Full=Non-structural protein 4A;
DE Short=NS4A;
DE Contains:
DE RecName: Full=Non-structural protein 4B;
DE Short=NS4B;
DE Contains:
DE RecName: Full=RNA-directed RNA polymerase;
DE EC=2.7.7.48;
DE AltName: Full=NS5;
OS West Nile virus (WNV).
OC Viruses; ssRNA positive-strand viruses, no DNA stage; Flaviviridae;
OC Flavivirus; Japanese encephalitis virus group.
OX NCBI_TaxID=11082;
OH NCBI_TaxID=7158; Aedes.
OH NCBI_TaxID=34610; Amblyomma variegatum (Tropical bont tick).
OH NCBI_TaxID=8782; Aves.
OH NCBI_TaxID=53527; Culex.
OH NCBI_TaxID=9606; Homo sapiens (Human).
OH NCBI_TaxID=34627; Hyalomma marginatum.
OH NCBI_TaxID=308735; Mansonia uniformis.
OH NCBI_TaxID=308737; Mimomyia.
OH NCBI_TaxID=34630; Rhipicephalus.
RN [1]
RP NUCLEOTIDE SEQUENCE [GENOMIC RNA].
RX MEDLINE=86124703; PubMed=3753811; DOI=10.1016/0042-6822(86)90082-6;
RA Castle E., Leidner U., Nowak T., Wengler G., Wengler G.;
RT "Primary structure of the West Nile flavivirus genome region coding
RT for all nonstructural proteins.";
RL Virology 149:10-26(1986).
RN [2]
RP SEQUENCE REVISION TO 1908; 2018-2036; 2242 AND 2859-2860.
RX MEDLINE=21176376; PubMed=11277701; DOI=10.1006/viro.2000.0795;
RA Yamshchikov V.F., Wengler G., Perelygin A.A., Brinton M.A.,
RA Compans R.W.;
RT "An infectious clone of the West Nile flavivirus.";
RL Virology 281:294-304(2001).
RN [3]
RP NUCLEOTIDE SEQUENCE [GENOMIC RNA] OF 1-291.
RX MEDLINE=85274372; PubMed=2992152; DOI=10.1016/0042-6822(85)90156-4;
RA Castle E., Nowak T., Leidner U., Wengler G., Wengler G.;
RT "Sequence analysis of the viral core protein and the membrane-
RT associated proteins V1 and NV2 of the flavivirus West Nile virus and
RT of the genome sequence for these proteins.";
RL Virology 145:227-236(1985).
RN [4]
RP NUCLEOTIDE SEQUENCE [GENOMIC RNA] OF 255-854.
RX MEDLINE=86072082; PubMed=3855247; DOI=10.1016/0042-6822(85)90129-1;
RA Wengler G., Castle E., Leidner U., Nowak T., Wengler G.;
RT "Sequence analysis of the membrane protein V3 of the flavivirus West
RT Nile virus and of its gene.";
RL Virology 147:264-274(1985).
RN [5]
RP DISULFIDE BONDS IN E PROTEIN.
RX MEDLINE=87122143; PubMed=3811228; DOI=10.1016/0042-6822(87)90443-0;
RA Nowak T., Wengler G.;
RT "Analysis of disulfides present in the membrane proteins of the West
RT Nile flavivirus.";
RL Virology 156:127-137(1987).
CC -!- FUNCTION: The small proteins NS2A, NS4A and NS4B are hydrophobic,
CC suggesting a possible membrane-related function. NS5 may play a
CC role in the viral RNA replication. The NS2B/NS3 protease complex
CC processes the viral polyprotein.
CC -!- CATALYTIC ACTIVITY: Selective hydrolysis of -Xaa-Xaa-|-Yaa- bonds
CC in which each of the Xaa can be either Arg or Lys and Yaa can be
CC either Ser or Ala.
CC -!- CATALYTIC ACTIVITY: Nucleoside triphosphate + RNA(n) = diphosphate
CC + RNA(n+1).
CC -!- SUBUNIT: NS3 and NS2B form a heterodimer. NS3 is the catalytic
CC subunit, whereas NS2B strongly stimulates the latter (By
CC similarity).
CC -!- INTERACTION:
CC P05106:ITGB3 (xeno); NbExp=2; IntAct=EBI-981051, EBI-702847;
CC -!- SUBCELLULAR LOCATION: Protein C: Virion (Potential). Membrane;
CC Single-pass membrane protein (Potential).
CC -!- SUBCELLULAR LOCATION: Small envelope protein M: Virion
CC (Potential). Membrane; Single-pass membrane protein (Potential).
CC -!- SUBCELLULAR LOCATION: Envelope protein E: Virion (Potential).
CC Membrane; Multi-pass membrane protein (Potential).
CC -!- SUBCELLULAR LOCATION: Non-structural protein 4A: Membrane; Single-
CC pass membrane protein (Potential).
CC -!- SUBCELLULAR LOCATION: Non-structural protein 4B: Membrane; Multi-
CC pass membrane protein (Potential).
CC -!- PTM: Specific enzymatic cleavages in vivo yield mature proteins
CC (By similarity).
CC -!- MISCELLANEOUS: The virion of this virus is a nucleocapsid covered
CC by a lipoprotein envelope. The envelope contains two proteins: the
CC protein M and glycoprotein E. The nucleocapsid is a complex of
CC protein C and mRNA. In immature particles, there are 60
CC icosahedrally organized trimeric spikes on the surface. Each spike
CC consists of three heterodimers of envelope protein M precursor
CC (prM) and envelope protein E (By similarity).
CC -!- SIMILARITY: Contains 1 helicase ATP-binding domain.
CC -!- SIMILARITY: Contains 1 helicase C-terminal domain.
CC -!- SIMILARITY: Contains 1 peptidase S7 domain.
CC -!- SIMILARITY: Contains 1 RdRp catalytic domain.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; M12294; AAA48498.2; -; Genomic_RNA.
DR PIR; A25256; GNWVWV.
DR RefSeq; NP_041724.2; -.
DR PDB; 2FP7; X-ray; 1.68 A; A=1420-1466, B=1517-1688.
DR PDB; 2G05; Model; -; D=1675-2120.
DR PDB; 2G2G; Model; -; D=1675-2120.
DR PDB; 2GGV; X-ray; 1.80 A; A=1419-1463, B=1503-1679.
DR PDB; 2IJO; X-ray; 2.30 A; A=1423-1469, B=1502-1686.
DR PDB; 2P5P; X-ray; 2.80 A; A/B/C=585-701.
DR PDBsum; 2FP7; -.
DR PDBsum; 2G05; -.
DR PDBsum; 2G2G; -.
DR PDBsum; 2GGV; -.
DR PDBsum; 2IJO; -.
DR PDBsum; 2P5P; -.
DR SMR; P06935; 25-97, 291-686, 1511-1679, 2531-2792.
DR IntAct; P06935; -.
DR GeneID; 912267; -.
DR GO; GO:0016021; C:integral to membrane; IEA:InterPro.
DR GO; GO:0019031; C:viral envelope; IEA:InterPro.
DR GO; GO:0019013; C:viral nucleocapsid; IEA:UniProtKB-KW.
DR GO; GO:0005524; F:ATP binding; IEA:InterPro.
DR GO; GO:0008026; F:ATP-dependent helicase activity; IEA:InterPro.
DR GO; GO:0003725; F:double-stranded RNA binding; IEA:InterPro.
DR GO; GO:0005515; F:protein binding; IPI:IntAct.
DR GO; GO:0003724; F:RNA helicase activity; IEA:InterPro.
DR GO; GO:0003968; F:RNA-directed RNA polymerase activity; IEA:InterPro.
DR GO; GO:0004252; F:serine-type endopeptidase activity; IEA:InterPro.
DR GO; GO:0005198; F:structural molecule activity; IEA:InterPro.
DR GO; GO:0016070; P:RNA metabolic process; IEA:InterPro.
DR GO; GO:0006410; P:transcription, RNA-dependent; IEA:UniProtKB-KW.
DR GO; GO:0019079; P:viral genome replication; IEA:InterPro.
DR InterPro; IPR014001; DEAD-like_N.
DR InterPro; IPR011492; DEAD_Flavivir.
DR InterPro; IPR001650; DNA/RNA_helicase_C.
DR InterPro; IPR002464; DNA/RNA_helicase_DEAH_CS.
DR InterPro; IPR013756; Flav_glyE_cen_2.
DR InterPro; IPR011999; Flav_glyE_cen_dm.
DR InterPro; IPR013754; Flav_glyE_dim.
DR InterPro; IPR001122; Flavi_capsidC.
DR InterPro; IPR000069; Flavi_M.
DR InterPro; IPR001157; Flavi_NS1.
DR InterPro; IPR000752; Flavi_NS2A.
DR InterPro; IPR000487; Flavi_NS2B.
DR InterPro; IPR000404; Flavi_NS4A.
DR InterPro; IPR001528; Flavi_NS4B.
DR InterPro; IPR002535; Flavi_propep.
DR InterPro; IPR000336; Flv_glyE_Ig-like.
DR InterPro; IPR014412; Gen_Poly_FLV.
DR InterPro; IPR014021; Helicase_SF1/SF2_ATP-bd.
DR InterPro; IPR001850; Peptidase_S7.
DR InterPro; IPR000208; RNA_pol_flaviviral.
DR InterPro; IPR007094; RNA_pol_PSvir.
DR InterPro; IPR002877; RrmJFtsJ_MeTrfase.
DR Gene3D; G3DSA:3.30.67.10; Flav_glyE_cen_2; 1.
DR Gene3D; G3DSA:2.60.98.10; Flav_glyE_dim; 1.
DR Gene3D; G3DSA:2.60.40.350; Flv_glyE_Ig-like; 1.
DR Pfam; PF01003; Flavi_capsid; 1.
DR Pfam; PF07652; Flavi_DEAD; 1.
DR Pfam; PF02832; Flavi_glycop_C; 1.
DR Pfam; PF00869; Flavi_glycoprot; 1.
DR Pfam; PF01004; Flavi_M; 1.
DR Pfam; PF00948; Flavi_NS1; 1.
DR Pfam; PF01005; Flavi_NS2A; 1.
DR Pfam; PF01002; Flavi_NS2B; 1.
DR Pfam; PF01350; Flavi_NS4A; 1.
DR Pfam; PF01349; Flavi_NS4B; 1.
DR Pfam; PF00972; Flavi_NS5; 1.
DR Pfam; PF01570; Flavi_propep; 1.
DR Pfam; PF01728; FtsJ; 1.
DR Pfam; PF00271; Helicase_C; 1.
DR Pfam; PF00949; Peptidase_S7; 1.
DR PIRSF; PIRSF003817; Gen_Poly_FLV; 1.
DR ProDom; PD001496; Flavi_NS1; 1.
DR SMART; SM00487; DEXDc; 1.
DR SMART; SM00490; HELICc; 1.
DR PROSITE; PS00690; DEAH_ATP_HELICASE; FALSE_NEG.
DR PROSITE; PS51192; HELICASE_ATP_BIND_1; 1.
DR PROSITE; PS51194; HELICASE_CTER; 1.
DR PROSITE; PS50507; RDRP_SSRNA_POS; 1.
PE 1: Evidence at protein level;
KW 3D-structure; ATP-binding; Capsid protein;
KW Cleavage on pair of basic residues; Complete proteome; Core protein;
KW Envelope protein; Glycoprotein; Helicase; Hydrolase; Membrane;
KW Nucleotide-binding; Nucleotidyltransferase; RNA replication;
KW RNA-directed RNA polymerase; Transferase; Transmembrane; Virion.
FT INIT_MET 1 1 Removed; by host.
FT CHAIN 2 123 Protein C.
FT /FTId=PRO_0000037743.
FT PROPEP 124 215
FT /FTId=PRO_0000037744.
FT CHAIN 216 290 Small envelope protein M.
FT /FTId=PRO_0000037745.
FT CHAIN 291 787 Envelope protein E.
FT /FTId=PRO_0000037746.
FT CHAIN 788 1139 Non-structural protein 1.
FT /FTId=PRO_0000037747.
FT CHAIN 1140 1370 Non-structural protein 2A.
FT /FTId=PRO_0000037748.
FT CHAIN 1371 1501 Flavivirin protease NS2B regulatory
FT subunit.
FT /FTId=PRO_0000037749.
FT CHAIN 1502 2120 Flavivirin protease NS3 catalytic
FT subunit.
FT /FTId=PRO_0000037750.
FT CHAIN 2121 2269 Non-structural protein 4A.
FT /FTId=PRO_0000037751.
FT CHAIN 2270 2525 Non-structural protein 4B.
FT /FTId=PRO_0000037752.
FT CHAIN 2526 3430 RNA-directed RNA polymerase.
FT /FTId=PRO_0000037753.
FT TRANSMEM 46 66 Potential.
FT TRANSMEM 106 126 Potential.
FT TRANSMEM 249 269 Potential.
FT TRANSMEM 276 292 Potential.
FT TRANSMEM 740 760 Potential.
FT TRANSMEM 767 787 Potential.
FT TRANSMEM 1139 1159 Potential.
FT TRANSMEM 1171 1191 Potential.
FT TRANSMEM 1213 1233 Potential.
FT TRANSMEM 1244 1264 Potential.
FT TRANSMEM 1279 1301 Potential.
FT TRANSMEM 1341 1361 Potential.
FT TRANSMEM 1372 1392 Potential.
FT TRANSMEM 1396 1416 Potential.
FT TRANSMEM 1474 1494 Potential.
FT TRANSMEM 2171 2191 Potential.
FT TRANSMEM 2197 2217 Potential.
FT TRANSMEM 2219 2239 Potential.
FT TRANSMEM 2255 2275 Potential.
FT TRANSMEM 2310 2330 Potential.
FT TRANSMEM 2356 2376 Potential.
FT TRANSMEM 2378 2398 Potential.
FT TRANSMEM 2442 2462 Potential.
FT DOMAIN 1508 1679 Peptidase S7.
FT DOMAIN 1682 1838 Helicase ATP-binding.
FT DOMAIN 1849 2014 Helicase C-terminal.
FT DOMAIN 3055 3207 RdRp catalytic.
FT NP_BIND 1695 1702 ATP (Potential).
FT REGION 388 401 Involved in fusion.
FT MOTIF 1786 1789 DEAH box.
FT ACT_SITE 1552 1552 Charge relay system (By similarity).
FT ACT_SITE 1576 1576 Charge relay system (By similarity).
FT ACT_SITE 1636 1636 Charge relay system (By similarity).
FT CARBOHYD 138 138 N-linked (GlcNAc...) (Potential).
FT CARBOHYD 917 917 N-linked (GlcNAc...) (Potential).
FT CARBOHYD 962 962 N-linked (GlcNAc...) (Potential).
FT CARBOHYD 994 994 N-linked (GlcNAc...) (Potential).
FT CARBOHYD 2336 2336 N-linked (GlcNAc...) (Potential).
FT CARBOHYD 2489 2489 N-linked (GlcNAc...) (Potential).
FT DISULFID 293 320
FT DISULFID 350 406
FT DISULFID 364 395
FT DISULFID 382 411
FT DISULFID 476 574
FT DISULFID 591 622
FT STRAND 1423 1428
FT STRAND 1444 1449
FT STRAND 1455 1457
FT STRAND 1522 1527
FT STRAND 1536 1543
FT STRAND 1546 1550
FT HELIX 1551 1554
FT STRAND 1559 1561
FT STRAND 1564 1566
FT STRAND 1568 1572
FT TURN 1573 1576
FT STRAND 1577 1583
FT STRAND 1592 1594
FT STRAND 1596 1600
FT STRAND 1608 1612
FT STRAND 1615 1619
FT STRAND 1622 1627
FT HELIX 1633 1635
FT STRAND 1639 1641
FT STRAND 1647 1651
FT STRAND 1654 1656
FT STRAND 1662 1665
SQ SEQUENCE 3430 AA; 380110 MW; 42D71B7CB12DC45B CRC64;
MSKKPGGPGK NRAVNMLKRG MPRGLSLIGL KRAMLSLIDG KGPIRFVLAL LAFFRFTAIA PTRAVLDRWR GVNKQTAMKH LLSFKKELGT LTSAINRRST KQKKRGGTAG FTILLGLIAC AGAVTLSNFQ GKVMMTVNAT DVTDVITIPT AAGKNLCIVR AMDVGYLCED TITYECPVLA AGNDPEDIDC WCTKSSVYVR YGRCTKTRHS RRSRRSLTVQ THGESTLANK KGAWLDSTKA TRYLVKTESW ILRNPGYALV AAVIGWMLGS NTMQRVVFAI LLLLVAPAYS FNCLGMSNRD FLEGVSGATW VDLVLEGDSC VTIMSKDKPT IDVKMMNMEA ANLADVRSYC YLASVSDLST RAACPTMGEA HNEKRADPAF VCKQGVVDRG WGNGCGLFGK GSIDTCAKFA CTTKATGWII QKENIKYEVA IFVHGPTTVE SHGKIGATQA GRFSITPSAP SYTLKLGEYG EVTVDCEPRS GIDTSAYYVM SVGEKSFLVH REWFMDLNLP WSSAGSTTWR NRETLMEFEE PHATKQSVVA LGSQEGALHQ ALAGAIPVEF SSNTVKLTSG HLKCRVKMEK LQLKGTTYGV CSKAFKFART PADTGHGTVV LELQYTGTDG PCKVPISSVA SLNDLTPVGR LVTVNPFVSV ATANSKVLIE LEPPFGDSYI VVGRGEQQIN HHWHKSGSSI GKAFTTTLRG AQRLAALGDT AWDFGSVGGV FTSVGKAIHQ VFGGAFRSLF GGMSWITQGL LGALLLWMGI NARDRSIAMT FLAVGGVLLF LSVNVHADTG CAIDIGRQEL RCGSGVFIHN DVEAWMDRYK FYPETPQGLA KIIQKAHAEG VCGLRSVSRL EHQMWEAIKD ELNTLLKENG VDLSVVVEKQ NGMYKAAPKR LAATTEKLEM GWKAWGKSII FAPELANNTF VIDGPETEEC PTANRAWNSM EVEDFGFGLT STRMFLRIRE TNTTECDSKI IGTAVKNNMA VHSDLSYWIE SGLNDTWKLE RAVLGEVKSC TWPETHTLWG DGVLESDLII PITLAGPRSN HNRRPGYKTQ NQGPWDEGRV EIDFDYCPGT TVTISDSCEH RGPAARTTTE SGKLITDWCC RSCTLPPLRF QTENGCWYGM EIRPTRHDEK TLVQSRVNAY NADMIDPFQL GLMVVFLATQ EVLRKRWTAK ISIPAIMLAL LVLVFGGITY TDVLRYVILV GAAFAEANSG GDVVHLALMA TFKIQPVFLV ASFLKARWTN QESILLMLAA AFFQMAYYDA
KNVLSWEVPD VLNSLSVAWM ILRAISFTNT SNVVVPLLAL LTPGLKCLNL DVYRILLLMV GVGSLIKEKR SSAAKKKGAC LICLALASTG VFNPMILAAG LMACDPNRKR GWPATEVMTA VGLMFAIVGG LAELDIDSMA IPMTIAGLMF AAFVISGKST DMWIERTADI TWESDAEITG SSERVDVRLD DDGNFQLMND PGAPWKIWML RMACLAISAY TPWAILPSVI GFWITLQYTK RGGVLWDTPS PKEYKKGDTT TGVYRIMTRG LLGSYQAGAG VMVEGVFHTL WHTTKGAALM SGEGRLDPYW GSVKEDRLCY GGPWKLQHKW NGHDEVQMIV VEPGKNVKNV QTKPGVFKTP EGEIGAVTLD YPTGTSGSPI VDKNGDVIGL YGNGVIMPNG SYISAIVQGE RMEEPAPAGF EPEMLRKKQI TVLDLHPGAG KTRKILPQII KEAINKRLRT AVLAPTRVVA AEMSEALRGL PIRYQTSAVH REHSGNEIVD VMCHATLTHR LMSPHRVPNY NLFIMDEAHF TDPASIAARG YIATKVELGE AAAIFMTATP PGTSDPFPES NAPISDMQTE IPDRAWNTGY EWITEYVGKT VWFVPSVKMG NEIALCLQRA GKKVIQLNRK SYETEYPKCK NDDWDFVITT DISEMGANFK ASRVIDSRKS VKPTIIEEGD GRVILGEPSA ITAASAAQRR GRIGRNPSQV GDEYCYGGHT NEDDSNFAHW TEARIMLDNI NMPNGLVAQL YQPEREKVYT MDGEYRLRGE ERKNFLEFLR TADLPVWLAY KVAAAGISYH DRKWCFDGPR TNTILEDNNE VEVITKLGER KILRPRWADA RVYSDHQALK SFKDFASGKR SQIGLVEVLG RMPEHFMVKT WEALDTMYVV ATAEKGGRAH RMALEELPDA LQTIVLIALL SVMSLGVFFL LMQRKGIGKI GLGGVILGAA TFFCWMAEVP GTKIAGMLLL SLLLMIVLIP EPEKQRSQTD NQLAVFLICV LTLVGAVAAN EMGWLDKTKN DIGSLLGHRP EARETTLGVE SFLLDLRPAT AWSLYAVTTA VLTPLLKHLI TSDYINTSLT SINVQASALF TLARGFPFVD VGVSALLLAV GCWGQVTLTV TVTAAALLFC HYAYMVPGWQ AEAMRSAQRR TAAGIMKNVV VDGIVATDVP ELERTTPVMQ KKVGQIILIL VSMAAVVVNP SVRTVREAGI LTTAAAVTLW ENGASSVWNA TTAIGLCHIM RGGWLSCLSI MWTLIKNMEK PGLKRGGAKG RTLGEVWKER LNHMTKEEFT RYRKEAITEV DRSAAKHARR EGNITGGHPV SRGTAKLRWL VERRFLEPVG KVVDLGCGRG GWCYYMATQK RVQEVKGYTK GGPGHEEPQL VQSYGWNIVT MKSGVDVFYR PSEASDTLLC DIGESSSSAE VEEHRTVRVL EMVEDWLHRG PKEFCIKVLC PYMPKVIEKM ETLQRRYGGG LIRNPLSRNS THEMYWVSHA SGNIVHSVNM TSQVLLGRME KKTWKGPQFE EDVNLGSGTR AVGKPLLNSD TSKIKNRIER LKKEYSSTWH QDANHPYRTW NYHGSYEVKP TGSASSLVNG VVRLLSKPWD TITNVTTMAM TDTTPFGQQR VFKEKVDTKA PEPPEGVKYV LNETTNWLWA FLARDKKPRM CSREEFIGKV NSNAALGAMF EEQNQWKNAR EAVEDPKFWE MVDEEREAHL RGECNTCIYN MMGKREKKPG EFGKAKGSRA IWFMWLGARF LEFEALGFLN EDHWLGRKNS GGGVEGLGLQ KLGYILKEVG TKPGGKVYAD DTAGWDTRIT 
KADLENEAKV LELLDGEHRR LARSIIELTY RHKVVKVMRP AADGKTVMDV ISREDQRGSG QVVTYALNTF TNLAVQLVRM MEGEGVIGPD DVEKLGKGKG PKVRTWLFEN GEERLSRMAV SGDDCVVKPL DDRFATSLHF LNAMSKVRKD IQEWKPSTGW YDWQQVPFCS NHFTELIMKD GRTLVVPCRG QDELIGRARI SPGAGWNVRD TACLAKSYAQ MWLLLYFHRR DLRLMANAIC SAVPANWVPT GRTTWSIHAK GEWMTTEDML AVWNRVWIEE NEWMEDKTPV ERWSDVPYSG KREDIWCGSL IGTRTRATWA ENIHVAINQV RSVIGEEKYV DYMSSLRRYE DTIVVEDTVL
//