ID   POLG_DEN27     STANDARD;      PRT;  3391 AA.
AC   P29991;
DT   01-APR-1993 (Rel. 25, Created)
DT   01-APR-1993 (Rel. 25, Last sequence update)
DT   01-FEB-2005 (Rel. 46, Last annotation update)
DE   Genome polyprotein [Contains: Capsid protein C (Core protein);
DE   Envelope protein M (Matrix protein); Major envelope protein E;
DE   Nonstructural protein 1 (NS1); Nonstructural protein 2A (NS2A);
DE   Flavivirin protease NS2B regulatory subunit; Flavivirin protease NS3
DE   catalytic subunit (EC 3.4.21.91); Nonstructural protein 4A (NS4A);
DE   Nonstructural protein 4B (NS4B); RNA-directed RNA polymerase
DE   (EC 2.7.7.48) (NS5)].
OS   Dengue virus type 2 (strain 16681-PDK53).
OC   Viruses; ssRNA positive-strand viruses, no DNA stage; Flaviviridae;
OC   Flavivirus.
OX   NCBI_TaxID=31635;
RN   [1]
RP   NUCLEOTIDE SEQUENCE.
RX   MEDLINE=92188532; PubMed=1312269;
RA   Blok J., McWilliam S.M., Butler H.C., Gibbs A.J., Weiller G.,
RA   Herring B.L., Hemsley A.C., Aaskov J.G., Yoksan S., Bhamarapravati N.;
RT   "Comparison of a dengue-2 virus and its candidate vaccine derivative:
RT   sequence relationships with the flaviviruses and other viruses.";
RL   Virology 187:573-590(1992).
CC   -!- FUNCTION: The small proteins NS2A, NS4A and NS4B are hydrophobic,
CC       suggesting a possible membrane-related function. NS5 may play a
CC       role in viral RNA replication. The NS2B/NS3 protease complex
CC       processes the viral polyprotein.
CC   -!- CATALYTIC ACTIVITY: Selective hydrolysis of -Xaa-Xaa-|-Yaa- bonds
CC       in which each of the Xaa can be either Arg or Lys and Yaa can be
CC       either Ser or Ala.
CC   -!- CATALYTIC ACTIVITY: Nucleoside triphosphate + RNA(n) = diphosphate
CC       + RNA(n+1).
CC   -!- SUBUNIT: NS3 and NS2B form a heterodimer. NS3 is the catalytic
CC       subunit, whereas NS2B strongly stimulates its activity (By
CC       similarity).
CC   -!- PTM: Specific enzymatic cleavages in vivo yield mature proteins
CC       (By similarity).
CC   -!- MISCELLANEOUS: The virion of this virus is a nucleocapsid covered
CC       by a lipoprotein envelope. The envelope contains two proteins: the
CC       protein M and glycoprotein E. The nucleocapsid is a complex of
CC       protein C and mRNA. In immature particles, there are 60
CC       icosahedrally organized trimeric spikes on the surface. Each spike
CC       consists of three heterodimers of envelope protein M precursor
CC       (prM) and envelope protein E (By similarity).
CC   -!- SIMILARITY: Contains 1 peptidase S7 domain.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; M84728; AAA73186.1; -.
DR   EMBL; M84727; -; NOT_ANNOTATED_CDS.
DR   PIR; B42451; GNWV26.
DR   HSSP; Q88653; 1OKE.
DR   InterPro; IPR001410; DEAD.
DR   InterPro; IPR002464; DEAH_box.
DR   InterPro; IPR001122; Flavi_capsidC.
DR   InterPro; IPR000336; Flavi_glycoprotE.
DR   InterPro; IPR000069; Flavi_M.
DR   InterPro; IPR001157; Flavi_NS1.
DR   InterPro; IPR000752; Flavi_NS2A.
DR   InterPro; IPR000487; Flavi_NS2B.
DR   InterPro; IPR000404; Flavi_NS4A.
DR   InterPro; IPR001528; Flavi_NS4B.
DR   InterPro; IPR000208; Flavi_NS5.
DR   InterPro; IPR002535; Flavi_propep.
DR   InterPro; IPR001850; Peptidase_S7.
DR   InterPro; IPR007095; RNA_pol_DS_PS.
DR   InterPro; IPR007094; RNA_pol_PSvir.
DR   InterPro; IPR002877; RrmJFtsJ_mtfrase.
DR   Pfam; PF01003; Flavi_capsid; 1.
DR   Pfam; PF02832; Flavi_glycop_C; 1.
DR   Pfam; PF00869; Flavi_glycoprot; 1.
DR   Pfam; PF01004; Flavi_M; 1.
DR   Pfam; PF00948; Flavi_NS1; 1.
DR   Pfam; PF01005; Flavi_NS2A; 1.
DR   Pfam; PF01002; Flavi_NS2B; 1.
DR   Pfam; PF01350; Flavi_NS4A; 1.
DR   Pfam; PF01349; Flavi_NS4B; 1.
DR   Pfam; PF00972; Flavi_NS5; 1.
DR   Pfam; PF01570; Flavi_propep; 1.
DR   Pfam; PF01728; FtsJ; 1.
DR   Pfam; PF00949; Peptidase_S7; 1.
DR   ProDom; PD001556; Flavi_glycoprotE; 1.
DR   ProDom; PD001496; Flavi_NS1; 1.
DR   PROSITE; PS00690; DEAH_ATP_HELICASE; FALSE_NEG.
KW   ATP-binding; Capsid protein; Core protein; Envelope protein;
KW   Glycoprotein; Helicase; Hydrolase; Polyprotein;
KW   RNA-directed RNA polymerase; Transferase; Transmembrane.
FT   CHAIN         1    114       Capsid protein C.
FT   PROPEP      115    205
FT   CHAIN       206    280       Envelope protein M.
FT   CHAIN       281    775       Major envelope protein E.
FT   CHAIN       776   1127       Nonstructural protein 1.
FT   CHAIN      1128   1345       Nonstructural protein 2A.
FT   CHAIN      1346   1475       Flavivirin protease NS2B regulatory
FT                                subunit.
FT   CHAIN      1476   2093       Flavivirin protease NS3 catalytic
FT                                subunit.
FT   CHAIN      2094   2243       Nonstructural protein 4A.
FT   CHAIN      2244   2491       Nonstructural protein 4B.
FT   CHAIN      2492   3391       RNA-directed RNA polymerase.
FT   NP_BIND    1668   1675       ATP (Potential).
FT   ACT_SITE   1526   1526       Charge relay system (By similarity).
FT   ACT_SITE   1550   1550       Charge relay system (By similarity).
FT   ACT_SITE   1610   1610       Charge relay system (By similarity).
FT   SITE       1759   1762       DEAH box.
FT   TRANSMEM     50     66       Potential.
FT   TRANSMEM    102    118       Potential.
FT   TRANSMEM    268    284       Potential.
FT   TRANSMEM    727    743       Potential.
FT   TRANSMEM    757    773       Potential.
FT   TRANSMEM   1158   1174       Potential.
FT   TRANSMEM   1272   1288       Potential.
FT   TRANSMEM   1294   1310       Potential.
FT   TRANSMEM   1351   1367       Potential.
FT   TRANSMEM   1373   1389       Potential.
FT   TRANSMEM   1448   1464       Potential.
FT   TRANSMEM   2148   2164       Potential.
FT   TRANSMEM   2174   2190       Potential.
FT   TRANSMEM   2197   2213       Potential.
FT   TRANSMEM   2227   2243       Potential.
FT   TRANSMEM   2352   2368       Potential.
FT   TRANSMEM   2411   2427       Potential.
FT   DISULFID    283    310       By similarity.
FT   DISULFID    340    401       By similarity.
FT   DISULFID    354    385       By similarity.
FT   DISULFID    372    396       By similarity.
FT   DISULFID    465    565       By similarity.
FT   DISULFID    582    613       By similarity.
FT   CARBOHYD    183    183       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    347    347       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    433    433       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    905    905       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    982    982       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   1134   1134       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2301   2301       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2305   2305       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2457   2457       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2485   2485       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2665   2665       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2704   2704       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2714   2714       N-linked (GlcNAc...) (Potential).
SQ   SEQUENCE   3391 AA;  379884 MW;  5EE11A74081C5177 CRC64;
     MNDQRKEAKN TPFNMLKRER NRVSTVQQLT KRFSLGMLQG RGPLKLYMAL VAFLRFLTIP
     PTAGILKRWG TIKKSKAINV LRGFRKEIGR MLNILNRRRR SAGMIIMLIP TVMAFHLTTR
     NGEPHMIVSR QEKGKSLLFK TEVGVNMCTL MAMDLGELCE DTITYKCPLL RQNEPEDIDC
     WCNSTSTWVT YGTCTTMGEH RREKRSVALV PHVGMGLETR TETWMSSEGA WKHVQRIETW
     ILRHPGFTMM AAILAYTIGT THFQRALILI LLTAVTPSMT MRCIGMSNRD FVEGVSGGSW
     VDIVLEHGSC VTTMAKNKPT LDFELIKTEA KQPATLRKYC IEAKLTNTTT ESRCPTQGEP
     SLNEEQDKRF VCKHSMVDRG WGNGCGLFGK GGIVTCAMFR CKKNMEGKVV QPENLEYTIV
     ITPHSGEEHA VGNDTGKHGK EIKITPQSSI TEAELTGYGT ITMECSPRTG LDFNEIVLLQ
     MENKAWLVHR QWFLDLPLPW LPGADTQGSN WIQKETLVTF KNPHAKKQDV VVLGSQEGAM
     HTALTGATEI QMSSGNLLFT GHLKCRLRMD KLQLKGMSYS MCTGKFKVVK EIAETQHGTI
     VIRVQYEGDG SPCKIPFEIM DLEKRHVLGR LITVNPIVTE KDSPVNIEAE PPFGDSYIII
     GVEPGQLKLN WFKKGSSIGQ MFETTMRGAK RMAILGDTAW DFGSLGGVFT SIGKALHQVF
     GAIYGAAFSG VSWTMKILIG VIITWIGMNS RSTSLSVTLV LVGIVTLYLG VMVQADSGCV
     VSWKNKELKC GSGIFITDNV HTWTEQYKFQ PESPSKLASA IQKAHEEDIC GIRSVTRLEN
     LMWKQITPEL NHILSENEVK LTIMTGDIKG IMQAGKRSLR PQPTELKYSW KTWGKAKMLS
     TESHNQTFFI DGPETAECPN TNRAWNSLEV EDYGFGVFTT NIWLKLKEKQ DVFCDSKLMS
     AAIKDNRAVH ADMGYWIESA LNDTWKIEKA SFIEVKNCHW PKSHTLWSNG VLESEMIIPK
     NLAGPVSKHN YRPGYHTQIT GPWHLGKLEM DFDFCDGTTV VVTEDCGNRG PSLRTTTASG
     KLITEWCCRS CTLPPLRYRG EDGCWYGMEI RPLKEKEENL VNSLVTAGHG QVDNFSLGVL
     GMALFLEEML RTRVGTKHAI LLVAVSFVTL IIGNRSFRDL GRVMVMVGAT MTDDIGMGVT
     YLALLAAFKV RPTFAAGLLL RKLTSKELMM TTIGIVLSSQ STIPETILEL TDALALGMMV
     LKMVRNMEKY QLAVTIMAIL CVPNAVILQN AWKVSCTILA VVSVSPLFLT SSQQKTDWIP
     LALTIKGLNP TAIFLTTLSR TSKKRSWPLN EAIMAVGMVS ILASSLLKND IPMTGPLVAG
     GLLTVCYVLT GRSADLELER AADVKWEDQA EISGSSPILS ITISEDGSMS IKNEEEEQTL
     TILIRRGLLV ISGLFPVSIP ITAAAWYLWE VKKQRAGVLW DVPSPPPMGK AELEDGAYRI
     KQKGILGYSQ IGAGVYKEGT FHTMWHVTRG AVLMHKGKRI EPSWADVKKD LISYGGGWKL
     EGEWKEGEEV QVLALDPGKN PRAVQTKPGL FKTNAGTIGA VSLDFSPGTS GSPIIDKKGK
     VVGLYGNGVV TRSGAYVSAI AQTEKSIEDN PEIEDDIFRK RRLTIMDLHP GAGKTKRYLP
     AIVREAIKRG LRTLILAPTR VVAAEMEEAL RGLPIRYQTP AIRAEHTGRE IVDLMCHATF
     TMRLLSPVRV PNYNLIIMDE AHFTDPASIA ARGYISTRVE MGEAAGIFMT ATPPGSRDPF
     PQSNAPIIDE EREIPERSWN SGHEWVTDFK GKTVWFVPSI KAGNDIAACL RKNGKKVIQL
     SRKTFDSEYV KTRTNDWDFV VTTDISEMGA NFKAERVIDP RRCMKPVILT DGEERVILAG
     PMPVTHSSAA QRRGRIGRNP KNENDQYIYM GEPLENDEDC AHWKEAKMLL DNINTPEGII
     PSMFEPEREK VDAIDGEYRL RGEARTTFVD LMRRGDLPVW LAYRVAAEGI NYADRRWCFD
     GVKNNQILEE NVEVEIWTKE GERKKLKPRW LDARIYSDPL ALKEFKEFAA GRKSLTLNLI
     TEMGRLPTFM TQKARNALDN LAVLHTAEAG GRAYNHALSE LPETLETLLL LTLLATVTGG
     IFLFLMSARG IGKMTLGMCC IITASILLWY AQIQPHWIAA SIILEFFLIV LLIPEPEKQR
     TPQDNQLTYV VIAILTVVAA TMANEMGFLE KTKKDLGLGS IATQQPESNI LDIDLRPASA
     WTLYAVATTF VTPMLRHSIE NSSVNVSLTA IANQATVLMG LGKGWPLSKM DIGVPLLAIG
     CYSQVNPTTL TAALFLLVAH YAIIGPALQA KASREAQKRA AAGIMKNPTV DGITVIDLDP
     IPYDPKFEKQ LGQVMLLVLC VTQVLMMRTT WALCEALTLA TGPISTLSEG NPGRFWNTTI
     AVSMANIFRG SYLAGAGLLF SIMKNTTNTR RVTGNIGETL GEKWKSRLNA LGKSEFQIYK
     KSGIQEVDRT LAKEGIKRGE TDHHAVSRGS AKLRWFVERN MVTPEGKVVD LGCGRGGWSY
     YCGGLKNVRE VKGLTKGGPG HEEPIPMSTY GWNLVRLQSG VDVFFIPPEK CDTLLCDIGE
     SSPNPTVEAG RTLRVLNLVE NWLNNNTQFC IKVLNPYMPS VIEKMEALQR KYGGALVRNP
     LSRNSTHEMY WVSNASGNIV SSVNMISRML INRFTMRYKK ATYEPDVDLG SGTRNIGIES
     EIPNLDIIGK RIEKIKQEHE TSWHYDQDHP YKTWAYHGSY ETKQTGSASS MVNGVFRLLT
     KPWDVVPMVT QMAMTDTTPF GQQRVFKEKV DTRTQEPKEG TKKLMKITAE WLWKELGKKK
     TPRMCTREEF TRKVRSNAAL GAIFTDENKW KSAREAVEDS RFWELVDKER NLHLEGKCET
     CVYNIMGKRE KKLGEFGKAK GSRAIWYMWL GARFLEFEAL GFLNEDHWFS RENSLSGVEG
     EGLHKLGYIL RDVSKKEGGA MYADDTAGWD TRITLEDLKN EEMVTNHMEG EHKKLAEAIF
     KLTYQNKVVR VQRPTPRGTV MDIISRRDQR GSGQVGTYGL NTFTNMEAQL IRQMEGEGVF
     KSIQHLTITE EIAVQNWLAR VGRERLSRMA ISGDDCVVKP LDDRLPSALT ALNDMGKIRK
     DIQQWEPSRG WNDWTQVPFC SHHFHELIMK DGRVLVVPCR NQDELIGRAR ISQGAGWSLR
     ETACLGKSYA QMWSLMYFHR RDLRLAANAI CSAVPSHWVP TSRTTWSIHA KHEWMTTEDM
     LTVWNRVWIQ ENPWMEDKTP VESWEEIPYL GKREDQWCGS LIGLTSRATW AKNIQAAINQ
     VRSLIGNEEY TDYMPSMKRF RREEEEAGVL W
//