ID POLG_DEN27 STANDARD; PRT; 3391 AA.
AC P29991;
DT 01-APR-1993 (REL. 25, CREATED)
DT 01-APR-1993 (REL. 25, LAST SEQUENCE UPDATE)
DT 01-NOV-1995 (REL. 32, LAST ANNOTATION UPDATE)
DE GENOME POLYPROTEIN (CONTAINS: CAPSID PROTEIN C (CORE PROTEIN); MATRIX
DE PROTEIN (ENVELOPE GLYCOPROTEIN M); MAJOR ENVELOPE PROTEIN E;
DE NONSTRUCTURAL PROTEINS NS1, NS2A, NS2B, NS4A AND NS4B; HELICASE (NS3);
DE RNA-DIRECTED RNA POLYMERASE (EC 2.7.7.48) (NS5)).
OS DENGUE VIRUS TYPE 2 (STRAIN 16681-PDK53).
OC VIRIDAE; SS-RNA ENVELOPED VIRUSES; POSITIVE-STRAND; FLAVIVIRIDAE;
OC FLAVIVIRUSES.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE; 92188532.
RA BLOK J., MCWILLIAM S.M., BUTLER H.C., GIBBS A.J., WEILLER G.,
RA HERRING B.L., HEMSLEY A.C., AASKOV J.G., YOKSAN S.,
RA BHAMARAPRAVATI N.;
RL VIROLOGY 187:573-590(1992).
CC -!- FUNCTION: THE SMALL PROTEINS NS2A, NS2B, NS4A AND NS4B ARE
CC HYDROPHOBIC, SUGGESTING A POSSIBLE MEMBRANE-RELATED FUNCTION.
CC NS3 AND NS5 MAY PLAY A ROLE IN THE VIRAL RNA REPLICATION.
CC -!- SUBUNIT: THE VIRION OF THIS VIRUS IS A NUCLEOCAPSID COVERED BY A
CC LIPOPROTEIN ENVELOPE. THE ENVELOPE CONSISTS OF TWO PROTEINS:
CC PROTEIN M AND GLYCOPROTEIN E. THE NUCLEOCAPSID IS A COMPLEX OF
CC PROTEIN C AND MRNA.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; M84728; G323471; -.
DR EMBL; M84727; -; NOT_ANNOTATED_CDS.
DR PIR; B42451; GNWV26.
KW POLYPROTEIN; GLYCOPROTEIN; RNA-DIRECTED RNA POLYMERASE; CORE PROTEIN;
KW COAT PROTEIN; ENVELOPE PROTEIN; HELICASE; ATP-BINDING; TRANSMEMBRANE;
KW NONSTRUCTURAL PROTEIN.
FT CHAIN 1 114 CAPSID PROTEIN C.
FT PROPEP 115 205
FT CHAIN 206 280 ENVELOPE GLYCOPROTEIN M.
FT CHAIN 281 775 ENVELOPE PROTEIN E.
FT CHAIN 776 1127 NONSTRUCTURAL PROTEIN NS1.
FT CHAIN 1128 1345 NONSTRUCTURAL PROTEIN NS2A.
FT CHAIN 1346 1474 NONSTRUCTURAL PROTEIN NS2B.
FT CHAIN 1475 2093 HELICASE (NS3).
FT CHAIN 2094 2243 NONSTRUCTURAL PROTEIN NS4A.
FT CHAIN 2244 2491 NONSTRUCTURAL PROTEIN NS4B.
FT CHAIN 2492 3391 RNA-DIRECTED RNA POLYMERASE (NS5).
FT NP_BIND 1668 1675 ATP (POTENTIAL).
FT SITE 1759 1762 DEAH BOX.
FT TRANSMEM 50 66 POTENTIAL.
FT TRANSMEM 102 118 POTENTIAL.
FT TRANSMEM 268 284 POTENTIAL.
FT TRANSMEM 727 743 POTENTIAL.
FT TRANSMEM 757 773 POTENTIAL.
FT TRANSMEM 1158 1174 POTENTIAL.
FT TRANSMEM 1272 1288 POTENTIAL.
FT TRANSMEM 1294 1310 POTENTIAL.
FT TRANSMEM 1351 1367 POTENTIAL.
FT TRANSMEM 1373 1389 POTENTIAL.
FT TRANSMEM 1448 1464 POTENTIAL.
FT TRANSMEM 2148 2164 POTENTIAL.
FT TRANSMEM 2174 2190 POTENTIAL.
FT TRANSMEM 2197 2213 POTENTIAL.
FT TRANSMEM 2227 2243 POTENTIAL.
FT TRANSMEM 2352 2368 POTENTIAL.
FT TRANSMEM 2411 2427 POTENTIAL.
FT DISULFID 283 310 BY SIMILARITY.
FT DISULFID 340 396 BY SIMILARITY.
FT DISULFID 354 385 BY SIMILARITY.
FT DISULFID 372 401 BY SIMILARITY.
FT DISULFID 465 565 BY SIMILARITY.
FT DISULFID 582 613 BY SIMILARITY.
FT CARBOHYD 183 183 POTENTIAL.
FT CARBOHYD 347 347 POTENTIAL.
FT CARBOHYD 433 433 POTENTIAL.
FT CARBOHYD 905 905 POTENTIAL.
FT CARBOHYD 982 982 POTENTIAL.
FT CARBOHYD 1134 1134 POTENTIAL.
FT CARBOHYD 1174 1174 POTENTIAL.
FT CARBOHYD 2301 2301 POTENTIAL.
FT CARBOHYD 2305 2305 POTENTIAL.
FT CARBOHYD 2457 2457 POTENTIAL.
FT CARBOHYD 2485 2485 POTENTIAL.
FT CARBOHYD 2665 2665 POTENTIAL.
FT CARBOHYD 2704 2704 POTENTIAL.
FT CARBOHYD 2714 2714 POTENTIAL.
SQ SEQUENCE 3391 AA; 379878 MW; 70570314 CRC32; MNDQRKEAKN TPFNMLKRER NRVSTVQQLT KRFSLGMLQG RGPLKLYMAL VAFLRFLTIP PTAGILKRWG TIKKSKAINV LRGFRKEIGR MLNILNRRRR SAGMIIMLIP TVMAFHLTTR NGEPHMIVSR QEKGKSLLFK TEVGVNMCTL MAMDLGELCE DTITYKCPLL RQNEPEDIDC WCNSTSTWVT YGTCTTMGEH RREKRSVALV PHVGMGLETR TETWMSSEGA WKHVQRIETW ILRHPGFTMM AAILAYTIGT THFQRALILI LLTAVTPSMT MRCIGMSNRD FVEGVSGGSW VDIVLEHGSC VTTMAKNKPT LDFELIKTEA KQPATLRKYC IEAKLTNTTT ESRCPTQGEP SLNEEQDKRF VCKHSMVDRG WGNGCGLFGK GGIVTCAMFR CKKNMEGKVV QPENLEYTIV ITPHSGEEHA VGNDTGKHGK EIKITPQSSI TEAELTGYGT ITMECSPRTG LDFNEIVLLQ MENKAWLVHR QWFLDLPLPW LPGADTQGSN WIQKETLVTF KNPHAKKQDV VVLGSQEGAM HTALTGATEI QMSSGNLLFT GHLKCRLRMD KLQLKGMSYS MCTGKFKVVK EIAETQHGTI VIRVQYEGDG SPCKIPFEIM DLEKRHVLGR LITVNPIVTE KDSPVNIEAE PPFGDSYIII GVEPGQLKLN WFKKGSSIGQ MFETTMRGAK RMAILGDTAW DFGSLGGVFT SIGKALHQVF GAIYGAAFSG VSWTMKILIG VIITWIGMNS RSTSLSVTLV LVGIVTLYLG VMVQADSGCV VSWKNKELKC GSGIFITDNV HTWTEQYKFQ PESPSKLASA IQKAHEEDIC GIRSVTRLEN LMWKQITPEL NHILSENEVK LTIMTGDIKG IMQAGKRSLR PQPTELKYSW KTWGKAKMLS TESHNQTFFI DGPETAECPN TNRAWNSLEV EDYGFGVFTT NIWLKLKEKQ DVFCDSKLMS AAIKDNRAVH ADMGYWIESA LNDTWKIEKA SFIEVKNCHW PKSHTLWSNG VLESEMIIPK NLAGPVSKHN YRPGYHTQIT GPWHLGKLEM DFDFCDGTTV VVTEDCGNRG PSLRTTTASG KLITEWCCRS CTLPPLRYRG EDGCWYGMEI RPLKEKEENL VNSLVTAGHG QVDNFSLGVL GMALFLEEML RTRVGTKHAI LLVAVSFVTL IIGNRSFRDL GRVMVMVGAT MTDDIGMGVT YLALLAAFKV RPTFAAGLLL RKLTSKELMM TTIGIVLSSQ STIPETILEL TDALALGMMV LKMVRNMEKY QLAVTIMAIL CVPNAVILQN AWKVSCTILA VVSVSPLFLT SSQQKTDWIP LALTIKGLNP TAIFLTTLSR TSKKRSWPLN EAIMAVGMVS ILASSLLKND IPMTGPLVAG GLLTVCYVLT GRSADLELER AADVKWEDQA EISGSSPILS ITISEDGSMS IKNEEEEQTL TILIRRGLLV ISGLFPVSIP ITAAAWYLWE VKKQRAGVLW DVPSPPPMGK AELEDGAYRI KQKGILGYSQ IGAGVYKEGT FHTMWHVTRG AVLMHKGKRI EPSWADVKKD LISYGGGWKL EGEWKEGEEV QVLALDPGKN PRAVQTKPGL FKTNAGTIGA VSLDFSPGTS GSPIIDKKGK VVGLYGNGVV TRSGAYVSAI AQTEKSIEDN PEIEDDIFRK RRLTIMDLHP GAGKTKRYLP AIVREAIKRG LRTLILAPTR VVAAEMEEAL RGLPIRYQTP AIRAEHTGRE IVDLMCHATF TMRLLSPVRV PNYNLIIMDE AHFTDPASIA 
ARGYISTRVE MGEAAGIFMT ATPPGSRDPF PQSNAPIIDE EREIPERSWN SGHEWVTDFK GKTVWFVPSI KAGNDIAACL RKNGKKVIQL SRKTFDSEYV KTRTNDWDFV VTTDISEMGA NFKAERVIDP RRCMKPVILT DGEERVILAG PMPVTHSSAA QRRGRIGRNP KNENDQYIYM GEPLENDEDC AHWKEAKMLL DNINTPEGII PSMFEPEREK VDAIDGEYRL RGEARTTFVD LMRRGDLPVW LAYRVAAEGI NYADRRWCFD GVKNNQILEE NVEVEIWTKE GERKKLKPRW LDARIYSDPL ALKEFKEFAA GRKSLTLNLI TEMGRLPTFM TQKARNALDN LAVLHTAEAG GRAYNHALSE LPETLETLLL LTLLATVTGG IFLFLMSARG IGKMTLGMCC IITASILLWY AQIQPHWIAA SIILEFFLIV LLIPEPEKQR TPQDNQLTYV VIAILTVVAA TMANEMGFLE KTKKDLGLGS IATQQPESNI LDIDLRPASA WTLYAVATTF VTPMLRHSIE NSSVNVSLTA IANQATVLMG LGKGWPLSKM DIGVPLLAIG CYSQVNPTTL TAALFLLVAH YAIIGPALQA KASREAQKRA AAGIMKNPTV DGITVIDLDP IPYDPKFEKQ LGQVMLLVLC VTQVLMMRTT WALCEALTLA TGPISTLSEG NPGRFWNTTI AVSMANIFRG SYLAGAGLLF SIMKNTTNTR RVTGNIGETL GEKWKSRLNA LGKSEFQIYK KSGIQEVDRT LAKEGIKRGE TDHHAVSRGS AKLRWFVERN MVTPEGKVVD LGCGRGGWSY YCGGLKNVRE VKGLTKGGPG HEEPIPMSTY GWNLVRLQSG VDVFFIPPEK CDTLLCDIGE SSPNPTVEAG RTLRVLNLVE NWLNNNTQFC IKVLNPYMPS VIEKMEALQR KYGGALVRNP LSRNSTHEMY WVSNASGNIV SSVNMISRML INRFTMRYKK ATYEPDVDLG SGTRNIGIES EIPNLDIIGK RIEKIKQEHE TSWHYDQDHP YKTWAYHGSY ETKQTGSASS MVNGVFRLLT KPWDVVPMVT QMAMTDTTPF GQQRVFKEKV DTRTQEPKEG TKKLMKITAE WLWKELGKKK TPRMCTREEF TRKVRSNAAL GAIFTDENKW KSAREAVEDS RFWELVDKER NLHLEGKCET CVYNIMGKRE KKLGEFGKAK GSRAIWYMWL GARFLEFEAL GFLNEDHWFS RENSLSGVEG EGLHKLGYIL RDVSKKEGGA MYADDTAGWD TRITLEDLKN EEMVTNHMEG EHKKLAEAIF KLTYQNKVVR VQRPTPRGTV MDIISRRDQR GSGQVGTYGL NTFTNMEAQL IRQMEGEGVF KSIQHLTITE EIAVQNWLAR VGRERLSRMA ISGDDCVVKP LDDRLPSALT ALNDMGKIRK DIQQWEPSRG WNDWTQVPFC SHHFHELIMK DGRVLVVPCR NQDELIGRAR ISQGAGWSLR ETACLGKSYA QMWSLMYFHR RDLRLAANAI CSAVPSHWVP TSRTTWSIHA KHEWMTTEDM LTVWNRVWIQ ENPWMEDKTP VESWEEIPYL GKREDQWCGS LIGLTSRATW AKNIQAAINQ VRSLIGNEEY TDYMPSMKRF RREEEEAGVL W //