ID   POLG_JAEVN              Reviewed;        1440 AA.
AC   P14403; P08769;
DT   01-JAN-1990, integrated into UniProtKB/Swiss-Prot.
DT   01-JAN-1990, sequence version 1.
DT   22-SEP-2009, entry version 73.
DE   RecName: Full=Genome polyprotein;
DE   Contains:
DE     RecName: Full=Protein C;
DE     AltName: Full=Core protein;
DE     AltName: Full=Capsid protein;
DE   Contains:
DE     RecName: Full=Small envelope protein M;
DE     AltName: Full=Matrix protein;
DE   Contains:
DE     RecName: Full=Envelope protein E;
DE   Contains:
DE     RecName: Full=Non-structural protein 1;
DE              Short=NS1;
DE   Contains:
DE     RecName: Full=Non-structural protein 2A;
DE              Short=NS2A;
DE   Contains:
DE     RecName: Full=Flavivirin protease NS2B regulatory subunit;
DE   Contains:
DE     RecName: Full=Flavivirin protease NS3 catalytic subunit;
DE              EC=3.4.21.91;
DE   Flags: Fragment;
OS   Japanese encephalitis virus (strain Nakayama).
OC   Viruses; ssRNA positive-strand viruses, no DNA stage; Flaviviridae;
OC   Flavivirus; Japanese encephalitis virus group.
OX   NCBI_TaxID=11076;
OH   NCBI_TaxID=8899; Ardeidae (herons).
OH   NCBI_TaxID=9913; Bos taurus (Bovine).
OH   NCBI_TaxID=308713; Culex gelidus.
OH   NCBI_TaxID=7178; Culex tritaeniorhynchus (Mosquito).
OH   NCBI_TaxID=9796; Equus caballus (Horse).
OH   NCBI_TaxID=9606; Homo sapiens (Human).
OH   NCBI_TaxID=9823; Sus scrofa (Pig).
RN   [1]
RP   NUCLEOTIDE SEQUENCE [GENOMIC RNA].
RX   MEDLINE=87236200; PubMed=3035787; DOI=10.1016/0042-6822(87)90207-8;
RA   McAda P.C., Mason P.W., Schmaljohn C.S., Dalrymple J.M., Mason T.L.,
RA   Fournier M.J.;
RT   "Partial nucleotide sequence of the Japanese encephalitis virus
RT   genome.";
RL   Virology 158:348-360(1987).
CC   -!- FUNCTION: The small proteins NS2A, NS4A and NS4B are hydrophobic,
CC       suggesting a possible membrane-related function. NS5 may play a
CC       role in the viral RNA replication. The NS2B/NS3 protease complex
CC       processes the viral polyprotein.
CC   -!- CATALYTIC ACTIVITY: Selective hydrolysis of -Xaa-Xaa-|-Yaa- bonds
CC       in which each of the Xaa can be either Arg or Lys and Yaa can be
CC       either Ser or Ala.
CC   -!- SUBUNIT: NS3 and NS2B form a heterodimer. NS3 is the catalytic
CC       subunit, whereas NS2B strongly stimulates the latter (By
CC       similarity).
CC   -!- SUBCELLULAR LOCATION: Protein C: Virion (Potential). Host
CC       membrane; Single-pass membrane protein (Potential).
CC   -!- SUBCELLULAR LOCATION: Small envelope protein M: Virion
CC       (Potential). Host membrane; Single-pass membrane protein
CC       (Potential).
CC   -!- SUBCELLULAR LOCATION: Envelope protein E: Virion (Potential). Host
CC       membrane; Multi-pass membrane protein (Potential).
CC   -!- PTM: Specific enzymatic cleavages in vivo yield mature proteins
CC       (By similarity).
CC   -!- MISCELLANEOUS: The virion of this virus is a nucleocapsid covered
CC       by a lipoprotein envelope. The envelope contains two proteins: the
CC       protein M and glycoprotein E. The nucleocapsid is a complex of
CC       protein C and mRNA. In immature particles, there are 60
CC       icosahedrally organized trimeric spikes on the surface. Each spike
CC       consists of three heterodimers of envelope protein M precursor
CC       (prM) and envelope protein E (By similarity).
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; M16574; AAA46251.1; -; Genomic_RNA.
DR   PIR; A27844; GNWVJF.
DR   HSSP; Q88653; 1OKE.
DR   SMR; P14403; 223-621.
DR   GO; GO:0016021; C:integral to membrane; IEA:UniProtKB-KW.
DR   GO; GO:0019031; C:viral envelope; IEA:UniProtKB-KW.
DR   GO; GO:0019013; C:viral nucleocapsid; IEA:UniProtKB-KW.
DR   GO; GO:0005524; F:ATP binding; IEA:UniProtKB-KW.
DR   GO; GO:0003725; F:double-stranded RNA binding; IEA:InterPro.
DR   GO; GO:0004386; F:helicase activity; IEA:UniProtKB-KW.
DR   GO; GO:0004252; F:serine-type endopeptidase activity; IEA:InterPro.
DR   GO; GO:0005198; F:structural molecule activity; IEA:UniProtKB-KW.
DR   GO; GO:0019058; P:viral infectious cycle; IEA:InterPro.
DR   InterPro; IPR000069; Env_glycoprot_M_flavivir.
DR   InterPro; IPR013756; Flav_glyE_cen_2.
DR   InterPro; IPR013754; Flav_glyE_dim.
DR   InterPro; IPR001122; Flavi_capsidC.
DR   InterPro; IPR001157; Flavi_NS1.
DR   InterPro; IPR000752; Flavi_NS2A.
DR   InterPro; IPR000487; Flavi_NS2B.
DR   InterPro; IPR002535; Flavi_propep.
DR   InterPro; IPR000336; Flv_glyE_Ig-like.
DR   InterPro; IPR011999; GlycoprotE_cen/dimer_Flavivir.
DR   Gene3D; G3DSA:3.30.67.10; Flav_glyE_cen_2; 1.
DR   Gene3D; G3DSA:2.60.98.10; Flav_glyE_dim; 1.
DR   Gene3D; G3DSA:2.60.40.350; Flv_glyE_Ig-like; 1.
DR   Pfam; PF01003; Flavi_capsid; 1.
DR   Pfam; PF02832; Flavi_glycop_C; 1.
DR   Pfam; PF00869; Flavi_glycoprot; 1.
DR   Pfam; PF01004; Flavi_M; 1.
DR   Pfam; PF00948; Flavi_NS1; 1.
DR   Pfam; PF01005; Flavi_NS2A; 1.
DR   Pfam; PF01002; Flavi_NS2B; 1.
DR   Pfam; PF01570; Flavi_propep; 1.
DR   ProDom; PD001496; Flavi_NS1; 1.
PE   3: Inferred from homology;
KW   ATP-binding; Capsid protein; Cleavage on pair of basic residues;
KW   Core protein; Disulfide bond; Envelope protein; Glycoprotein;
KW   Helicase; Hydrolase; Membrane; Nucleotide-binding; Transmembrane;
KW   Virion.
FT   CHAIN        <1     53       Protein C.
FT                                /FTId=PRO_0000037869.
FT   PROPEP       54    146
FT                                /FTId=PRO_0000037870.
FT   CHAIN       147    222       Small envelope protein M.
FT                                /FTId=PRO_0000037871.
FT   CHAIN       223    794       Envelope protein E.
FT                                /FTId=PRO_0000037872.
FT   CHAIN       795   1136       Non-structural protein 1.
FT                                /FTId=PRO_0000037873.
FT   CHAIN      1137   1301       Non-structural protein 2A.
FT                                /FTId=PRO_0000037874.
FT   CHAIN      1302   1432       Flavivirin protease NS2B regulatory
FT                                subunit.
FT                                /FTId=PRO_0000037875.
FT   CHAIN      1433  >1440       Flavivirin protease NS3 catalytic
FT                                subunit.
FT                                /FTId=PRO_0000037876.
FT   TRANSMEM     36     56       Potential.
FT   TRANSMEM    181    201       Potential.
FT   TRANSMEM    208    224       Potential.
FT   TRANSMEM    675    695       Potential.
FT   TRANSMEM    702    722       Potential.
FT   TRANSMEM   1106   1126       Potential.
FT   TRANSMEM   1148   1168       Potential.
FT   TRANSMEM   1179   1199       Potential.
FT   TRANSMEM   1201   1221       Potential.
FT   TRANSMEM   1238   1258       Potential.
FT   TRANSMEM   1270   1290       Potential.
FT   TRANSMEM   1303   1323       Potential.
FT   TRANSMEM   1327   1347       Potential.
FT   TRANSMEM   1405   1425       Potential.
FT   CARBOHYD     68     68       N-linked (GlcNAc...); by host
FT                                (Potential).
FT   CARBOHYD    376    376       N-linked (GlcNAc...); by host
FT                                (Potential).
FT   CARBOHYD    852    852       N-linked (GlcNAc...); by host
FT                                (Potential).
FT   CARBOHYD    929    929       N-linked (GlcNAc...); by host
FT                                (Potential).
FT   DISULFID    225    252       By similarity.
FT   DISULFID    282    338       By similarity.
FT   DISULFID    296    327       By similarity.
FT   DISULFID    314    343       By similarity.
FT   DISULFID    412    509       By similarity.
FT   DISULFID    526    557       By similarity.
FT   NON_TER       1      1
FT   NON_TER    1440   1440
SQ   SEQUENCE   1440 AA;  158185 MW;  4D489A365A3C2E6E CRC64;
     SVAMKHLTSF KRELGTLIDA VNKRGRKQNK RGGNEGSIMW LASLAVVIAC AGAMKLSNFQ
     GKLLMTVNNT DIADVIVIPN PSKGENRCWV RAIDVGYMCE DTITYECPKL TMGNDPEDVD
     CWCDNQEVYV QYGRCTRTRH SKRSRRSVSV QTHGESSLVN KKEAWLDSTK ATRYLMKTEN
     WIVRNPGYAF LAAILGWMLG SNNGQRRWYF TILLLLVAPA YSFNCLGMGN RDFIEGASGA
     TWVDLVLEGD SCLTIMANDK PTLDVRMINI EAVQLAEVRS YCYHASVTDI STVARCPTTG
     EAHNEKRADS SYVCKQGFTD RGWGNGCGLF GKGSIDTCAK FSCTSKAIGR TIQPENIKYE
     VGIFVHGTTT SENHGNYSAQ VGASQAAKFT VTPNAPSITL KLGDYGEVTL DCEPRSGLNT
     EAFYVMTVGS KSFLVHREWF HDLALPWTPP SSTAWRNREL LMEFEEAHAT KQSVVALGSQ
     EGGLHQALAG AIVVEYSSSV KLTSGHLKCR LKMDKLALKG TTYGMCTEKF SFAKNPADTG
     HGTVVIELSY SGSDGPCKIP IVSVASLNDM TPVGRLVTVN PFVATSSANS KVLVEMEPPF
     GDSYIVVGRG DKQINHHWHK AGSTLGKAFS TTLKGAQRLA ALGDTAWDFG SIGGVFNSIG
     KAVHQVFGGA FRTLFGGMSW ITQGLMGALL LWMGVNARDR SIALAFLATG GVLVFLATNV
     HADTGCAIDI TRKEMRCGSG IFVHNDVEAW VDRYKYLPET PRSLAKIVHK AHKEGVCGVR
     SVTRLEHQMW EAVRDELNVL LKENAVDLSV VVNKPVGRYR SAPKRLSMTQ EKFEMGWKAW
     GKSILFAPEL ANSTFVVDGP ETKECPDEHR AWNSIEIEDF GFGITSTRVW LKIREESTDE
     CDGAIIGTAV KGHVAVHSDL SYWIESRYND TWKLERAVFG EVKSCTWPET HTLWGDGVEE
     SELIIPHTIA GPKSKHNRRE GYKTQNQGPW DENGIVLDFD YCPGTKVTIT EDCGKRGPSV
     RTTTDSGKLI TDWCCRSCSL PPLRFRTENG CWYGMEIRPV RHDETTLVRS QVDAFNGEMV
     DPFQLGLLVM FLATQEVLRK RWTARLTIPA VLGALLVLML GGITYTDLAR YVVLVAAAFA
     EANSGGDVLH LALIAVFKIQ PAFLVMNMLS TRWTNQENVV LVLGAAFFHL ASVDLQIGVH
     GILNAAAIAW MIVRAITFPT TSSVTMPVLA LLTPGMRALY LDTYRIILLV IGICSLLQER
     KKTMAKKKGA VLLGLALTST GWFSPTTIAA GLMVCNPNKK RGWPATEFLS AVGLMFAIVG
     GLAELDIESM SIPFMLAGLM AVSYVVSGKA TDMWLERAAD ISWEMDAAIT GSSRRLDVKL
     DDDGDFHLID DPGVPWKVWV LRMSCIGLAA LTPWAIVPAA FGYWLTLKTT KRGGVFWDTP
//