ID   POLG_DEN27     STANDARD;      PRT;  3391 AA.
AC   P29991;
DT   01-APR-1993 (REL. 25, CREATED)
DT   01-APR-1993 (REL. 25, LAST SEQUENCE UPDATE)
DT   01-FEB-1994 (REL. 28, LAST ANNOTATION UPDATE)
DE   GENOME POLYPROTEIN (CONTAINS: CAPSID PROTEIN C (CORE PROTEIN); MATRIX
DE   PROTEIN (ENVELOPE GLYCOPROTEIN M); MAJOR ENVELOPE PROTEIN E;
DE   NONSTRUCTURAL PROTEINS NS1, NS2A, NS2B, NS4A AND NS4B; HELICASE (NS3);
DE   RNA-DIRECTED RNA POLYMERASE (EC 2.7.7.48) (NS5)).
OS   DENGUE VIRUS TYPE 2 (STRAIN 16681-PDK53).
OC   VIRIDAE; SS-RNA ENVELOPED VIRUSES; POSITIVE-STRAND; FLAVIVIRIDAE;
OC   FLAVIVIRUSES.
RN   [1]
RP   SEQUENCE FROM N.A.
RM   92188532
RA   BLOK J., MCWILLIAM S.M., BUTLER H.C., GIBBS A.J., WEILLER G.,
RA   HERRING B.L., HEMSLEY A.C., AASKOV J.G., YOKSAN S.,
RA   BHAMARAPRAVATI N.;
RL   VIROLOGY 187:573-590(1992).
CC   -!- FUNCTION: THE SMALL PROTEINS NS2A, NS2B, NS4A AND NS4B ARE
CC       HYDROPHOBIC, SUGGESTING A POSSIBLE MEMBRANE-RELATED FUNCTION.
CC       NS3 AND NS5 MAY PLAY A ROLE IN VIRAL RNA REPLICATION.
CC   -!- SUBUNIT: THE VIRION OF THIS VIRUS IS A NUCLEOCAPSID COVERED BY A
CC       LIPOPROTEIN ENVELOPE. THE ENVELOPE CONSISTS OF TWO PROTEINS:
CC       PROTEIN M AND GLYCOPROTEIN E. THE NUCLEOCAPSID IS A COMPLEX OF
CC       PROTEIN C AND MRNA.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; M84727; DENCMEMSB.
DR   PIR; B42451; GNWV26.
KW   POLYPROTEIN; GLYCOPROTEIN; RNA-DIRECTED RNA POLYMERASE; CORE PROTEIN;
KW   COAT PROTEIN; ENVELOPE PROTEIN; HELICASE; ATP-BINDING; TRANSMEMBRANE;
KW   NONSTRUCTURAL PROTEIN.
FT   CHAIN         1    114       CAPSID PROTEIN C.
FT   PROPEP      115    205
FT   CHAIN       206    280       ENVELOPE GLYCOPROTEIN M.
FT   CHAIN       281    775       ENVELOPE PROTEIN E.
FT   CHAIN       776   1127       NONSTRUCTURAL PROTEIN NS1.
FT   CHAIN      1128   1345       NONSTRUCTURAL PROTEIN NS2A.
FT   CHAIN      1346   1474       NONSTRUCTURAL PROTEIN NS2B.
FT   CHAIN      1475   2093       HELICASE (NS3).
FT   CHAIN      2094   2243       NONSTRUCTURAL PROTEIN NS4A.
FT   CHAIN      2244   2491       NONSTRUCTURAL PROTEIN NS4B.
FT   CHAIN      2492   3391       RNA-DIRECTED RNA POLYMERASE (NS5).
FT   NP_BIND    1668   1675       ATP (POTENTIAL).
FT   SITE       1759   1762       DEAH BOX.
FT   TRANSMEM     50     66       POTENTIAL.
FT   TRANSMEM    102    118       POTENTIAL.
FT   TRANSMEM    268    284       POTENTIAL.
FT   TRANSMEM    727    743       POTENTIAL.
FT   TRANSMEM    757    773       POTENTIAL.
FT   TRANSMEM   1158   1174       POTENTIAL.
FT   TRANSMEM   1272   1288       POTENTIAL.
FT   TRANSMEM   1294   1310       POTENTIAL.
FT   TRANSMEM   1351   1367       POTENTIAL.
FT   TRANSMEM   1373   1389       POTENTIAL.
FT   TRANSMEM   1448   1464       POTENTIAL.
FT   TRANSMEM   2148   2164       POTENTIAL.
FT   TRANSMEM   2174   2190       POTENTIAL.
FT   TRANSMEM   2197   2213       POTENTIAL.
FT   TRANSMEM   2227   2243       POTENTIAL.
FT   TRANSMEM   2352   2368       POTENTIAL.
FT   TRANSMEM   2411   2427       POTENTIAL.
FT   CARBOHYD    183    183       POTENTIAL.
FT   CARBOHYD    347    347       POTENTIAL.
FT   CARBOHYD    433    433       POTENTIAL.
FT   CARBOHYD    905    905       POTENTIAL.
FT   CARBOHYD    982    982       POTENTIAL.
FT   CARBOHYD   1134   1134       POTENTIAL.
FT   CARBOHYD   1174   1174       POTENTIAL.
FT   CARBOHYD   2301   2301       POTENTIAL.
FT   CARBOHYD   2305   2305       POTENTIAL.
FT   CARBOHYD   2457   2457       POTENTIAL.
FT   CARBOHYD   2485   2485       POTENTIAL.
FT   CARBOHYD   2665   2665       POTENTIAL.
FT   CARBOHYD   2704   2704       POTENTIAL.
FT   CARBOHYD   2714   2714       POTENTIAL.
SQ   SEQUENCE   3391 AA;  379878 MW;  22199405 CN;
     MNDQRKEAKN TPFNMLKRER NRVSTVQQLT KRFSLGMLQG RGPLKLYMAL VAFLRFLTIP
     PTAGILKRWG TIKKSKAINV LRGFRKEIGR MLNILNRRRR SAGMIIMLIP TVMAFHLTTR
     NGEPHMIVSR QEKGKSLLFK TEVGVNMCTL MAMDLGELCE DTITYKCPLL RQNEPEDIDC
     WCNSTSTWVT YGTCTTMGEH RREKRSVALV PHVGMGLETR TETWMSSEGA WKHVQRIETW
     ILRHPGFTMM AAILAYTIGT THFQRALILI LLTAVTPSMT MRCIGMSNRD FVEGVSGGSW
     VDIVLEHGSC VTTMAKNKPT LDFELIKTEA KQPATLRKYC IEAKLTNTTT ESRCPTQGEP
     SLNEEQDKRF VCKHSMVDRG WGNGCGLFGK GGIVTCAMFR CKKNMEGKVV QPENLEYTIV
     ITPHSGEEHA VGNDTGKHGK EIKITPQSSI TEAELTGYGT ITMECSPRTG LDFNEIVLLQ
     MENKAWLVHR QWFLDLPLPW LPGADTQGSN WIQKETLVTF KNPHAKKQDV VVLGSQEGAM
     HTALTGATEI QMSSGNLLFT GHLKCRLRMD KLQLKGMSYS MCTGKFKVVK EIAETQHGTI
     VIRVQYEGDG SPCKIPFEIM DLEKRHVLGR LITVNPIVTE KDSPVNIEAE PPFGDSYIII
     GVEPGQLKLN WFKKGSSIGQ MFETTMRGAK RMAILGDTAW DFGSLGGVFT SIGKALHQVF
     GAIYGAAFSG VSWTMKILIG VIITWIGMNS RSTSLSVTLV LVGIVTLYLG VMVQADSGCV
     VSWKNKELKC GSGIFITDNV HTWTEQYKFQ PESPSKLASA IQKAHEEDIC GIRSVTRLEN
     LMWKQITPEL NHILSENEVK LTIMTGDIKG IMQAGKRSLR PQPTELKYSW KTWGKAKMLS
     TESHNQTFFI DGPETAECPN TNRAWNSLEV EDYGFGVFTT NIWLKLKEKQ DVFCDSKLMS
     AAIKDNRAVH ADMGYWIESA LNDTWKIEKA SFIEVKNCHW PKSHTLWSNG VLESEMIIPK
     NLAGPVSKHN YRPGYHTQIT GPWHLGKLEM DFDFCDGTTV VVTEDCGNRG PSLRTTTASG
     KLITEWCCRS CTLPPLRYRG EDGCWYGMEI RPLKEKEENL VNSLVTAGHG QVDNFSLGVL
     GMALFLEEML RTRVGTKHAI LLVAVSFVTL IIGNRSFRDL GRVMVMVGAT MTDDIGMGVT
     YLALLAAFKV RPTFAAGLLL RKLTSKELMM TTIGIVLSSQ STIPETILEL TDALALGMMV
     LKMVRNMEKY QLAVTIMAIL CVPNAVILQN AWKVSCTILA VVSVSPLFLT SSQQKTDWIP
     LALTIKGLNP TAIFLTTLSR TSKKRSWPLN EAIMAVGMVS ILASSLLKND IPMTGPLVAG
     GLLTVCYVLT GRSADLELER AADVKWEDQA EISGSSPILS ITISEDGSMS IKNEEEEQTL
     TILIRRGLLV ISGLFPVSIP ITAAAWYLWE VKKQRAGVLW DVPSPPPMGK AELEDGAYRI
     KQKGILGYSQ IGAGVYKEGT FHTMWHVTRG AVLMHKGKRI EPSWADVKKD LISYGGGWKL
     EGEWKEGEEV QVLALDPGKN PRAVQTKPGL FKTNAGTIGA VSLDFSPGTS GSPIIDKKGK
     VVGLYGNGVV TRSGAYVSAI AQTEKSIEDN PEIEDDIFRK RRLTIMDLHP GAGKTKRYLP
     AIVREAIKRG LRTLILAPTR VVAAEMEEAL RGLPIRYQTP AIRAEHTGRE IVDLMCHATF
     TMRLLSPVRV PNYNLIIMDE AHFTDPASIA ARGYISTRVE MGEAAGIFMT ATPPGSRDPF
     PQSNAPIIDE EREIPERSWN SGHEWVTDFK GKTVWFVPSI KAGNDIAACL RKNGKKVIQL
     SRKTFDSEYV KTRTNDWDFV VTTDISEMGA NFKAERVIDP RRCMKPVILT DGEERVILAG
     PMPVTHSSAA QRRGRIGRNP KNENDQYIYM GEPLENDEDC AHWKEAKMLL DNINTPEGII
     PSMFEPEREK VDAIDGEYRL RGEARTTFVD LMRRGDLPVW LAYRVAAEGI NYADRRWCFD
     GVKNNQILEE NVEVEIWTKE GERKKLKPRW LDARIYSDPL ALKEFKEFAA GRKSLTLNLI
     TEMGRLPTFM TQKARNALDN LAVLHTAEAG GRAYNHALSE LPETLETLLL LTLLATVTGG
     IFLFLMSARG IGKMTLGMCC IITASILLWY AQIQPHWIAA SIILEFFLIV LLIPEPEKQR
     TPQDNQLTYV VIAILTVVAA TMANEMGFLE KTKKDLGLGS IATQQPESNI LDIDLRPASA
     WTLYAVATTF VTPMLRHSIE NSSVNVSLTA IANQATVLMG LGKGWPLSKM DIGVPLLAIG
     CYSQVNPTTL TAALFLLVAH YAIIGPALQA KASREAQKRA AAGIMKNPTV DGITVIDLDP
     IPYDPKFEKQ LGQVMLLVLC VTQVLMMRTT WALCEALTLA TGPISTLSEG NPGRFWNTTI
     AVSMANIFRG SYLAGAGLLF SIMKNTTNTR RVTGNIGETL GEKWKSRLNA LGKSEFQIYK
     KSGIQEVDRT LAKEGIKRGE TDHHAVSRGS AKLRWFVERN MVTPEGKVVD LGCGRGGWSY
     YCGGLKNVRE VKGLTKGGPG HEEPIPMSTY GWNLVRLQSG VDVFFIPPEK CDTLLCDIGE
     SSPNPTVEAG RTLRVLNLVE NWLNNNTQFC IKVLNPYMPS VIEKMEALQR KYGGALVRNP
     LSRNSTHEMY WVSNASGNIV SSVNMISRML INRFTMRYKK ATYEPDVDLG SGTRNIGIES
     EIPNLDIIGK RIEKIKQEHE TSWHYDQDHP YKTWAYHGSY ETKQTGSASS MVNGVFRLLT
     KPWDVVPMVT QMAMTDTTPF GQQRVFKEKV DTRTQEPKEG TKKLMKITAE WLWKELGKKK
     TPRMCTREEF TRKVRSNAAL GAIFTDENKW KSAREAVEDS RFWELVDKER NLHLEGKCET
     CVYNIMGKRE KKLGEFGKAK GSRAIWYMWL GARFLEFEAL GFLNEDHWFS RENSLSGVEG
     EGLHKLGYIL RDVSKKEGGA MYADDTAGWD TRITLEDLKN EEMVTNHMEG EHKKLAEAIF
     KLTYQNKVVR VQRPTPRGTV MDIISRRDQR GSGQVGTYGL NTFTNMEAQL IRQMEGEGVF
     KSIQHLTITE EIAVQNWLAR VGRERLSRMA ISGDDCVVKP LDDRLPSALT ALNDMGKIRK
     DIQQWEPSRG WNDWTQVPFC SHHFHELIMK DGRVLVVPCR NQDELIGRAR ISQGAGWSLR
     ETACLGKSYA QMWSLMYFHR RDLRLAANAI CSAVPSHWVP TSRTTWSIHA KHEWMTTEDM
     LTVWNRVWIQ ENPWMEDKTP VESWEEIPYL GKREDQWCGS LIGLTSRATW AKNIQAAINQ
     VRSLIGNEEY TDYMPSMKRF RREEEEAGVL W
//