ID   POLG_JAEVN              Reviewed;        1440 AA.
AC   P14403; P08769;
DT   01-JAN-1990, integrated into UniProtKB/Swiss-Prot.
DT   01-JAN-1990, sequence version 1.
DT   10-MAY-2017, entry version 117.
DE   RecName: Full=Genome polyprotein;
DE   Contains:
DE     RecName: Full=Capsid protein C;
DE     AltName: Full=Core protein;
DE   Contains:
DE     RecName: Full=prM;
DE   Contains:
DE     RecName: Full=Peptide pr;
DE   Contains:
DE     RecName: Full=Small envelope protein M;
DE     AltName: Full=Matrix protein;
DE   Contains:
DE     RecName: Full=Envelope protein E;
DE   Contains:
DE     RecName: Full=Non-structural protein 1;
DE              Short=NS1;
DE   Contains:
DE     RecName: Full=Non-structural protein 2A;
DE              Short=NS2A;
DE   Contains:
DE     RecName: Full=Serine protease subunit NS2B;
DE     AltName: Full=Flavivirin protease NS2B regulatory subunit;
DE     AltName: Full=Non-structural protein 2B;
DE   Contains:
DE     RecName: Full=Serine protease NS3;
DE              EC=3.4.21.91;
DE              EC=3.6.1.15;
DE              EC=3.6.4.13;
DE     AltName: Full=Flavivirin protease NS3 catalytic subunit;
DE     AltName: Full=Non-structural protein 3;
DE   Flags: Fragment;
OS   Japanese encephalitis virus (strain Nakayama).
OC   Viruses; ssRNA viruses; ssRNA positive-strand viruses, no DNA stage;
OC   Flaviviridae; Flavivirus; Japanese encephalitis virus group.
OX   NCBI_TaxID=11076;
OH   NCBI_TaxID=8899; Ardeidae (herons).
OH   NCBI_TaxID=9913; Bos taurus (Bovine).
OH   NCBI_TaxID=308713; Culex gelidus.
OH   NCBI_TaxID=7178; Culex tritaeniorhynchus (Mosquito).
OH   NCBI_TaxID=9796; Equus caballus (Horse).
OH   NCBI_TaxID=9606; Homo sapiens (Human).
OH   NCBI_TaxID=9823; Sus scrofa (Pig).
RN   [1]
RP   NUCLEOTIDE SEQUENCE [GENOMIC RNA].
RX   PubMed=3035787; DOI=10.1016/0042-6822(87)90207-8;
RA   McAda P.C., Mason P.W., Schmaljohn C.S., Dalrymple J.M., Mason T.L.,
RA   Fournier M.J.;
RT   "Partial nucleotide sequence of the Japanese encephalitis virus
RT   genome.";
RL   Virology 158:348-360(1987).
CC   -!- FUNCTION: Capsid protein C self-assembles to form an icosahedral
CC       capsid about 30 nm in diameter. The capsid encapsulates the
CC       genomic RNA (By similarity). {ECO:0000250}.
CC   -!- FUNCTION: prM acts as a chaperone for envelope protein E during
CC       intracellular virion assembly by masking and inactivating envelope
CC       protein E fusion peptide. prM is matured in the last step of
CC       virion assembly, presumably to avoid catastrophic activation of
CC       the viral fusion peptide induced by the acidic pH of the trans-
CC       Golgi network. After cleavage by host furin, the pr peptide is
CC       released in the extracellular medium and small envelope protein M
CC       and envelope protein E homodimers are dissociated (By similarity).
CC       {ECO:0000250}.
CC   -!- FUNCTION: Envelope protein E binding to host cell surface receptor
CC       is followed by virus internalization through clathrin-mediated
CC       endocytosis. Envelope protein E is subsequently involved in
CC       membrane fusion between virion and host late endosomes.
CC       Synthesized as a heterodimer with prM which acts as a chaperone
CC       for envelope protein E. After cleavage of prM, envelope protein E
CC       dissociates from small envelope protein M and homodimerizes (By
CC       similarity). {ECO:0000250}.
CC   -!- FUNCTION: Non-structural protein 1 is involved in virus
CC       replication and regulation of the innate immune response.
CC       {ECO:0000250}.
CC   -!- FUNCTION: Non-structural protein 2A may be involved in viral RNA
CC       replication and capsid assembly. {ECO:0000305}.
CC   -!- FUNCTION: Non-structural protein 2B is a required cofactor for the
CC       serine protease function of NS3. {ECO:0000255|PROSITE-
CC       ProRule:PRU00859}.
CC   -!- FUNCTION: Serine protease NS3 displays three enzymatic activities:
CC       serine protease, NTPase and RNA helicase. NS3 serine protease, in
CC       association with NS2B, performs its autocleavage and cleaves the
CC       polyprotein at dibasic sites in the cytoplasm: C-prM, NS2A-NS2B,
CC       NS2B-NS3, NS3-NS4A, NS4A-2K and NS4B-NS5. NS3 RNA helicase binds
CC       RNA and unwinds dsRNA in the 3' to 5' direction (By similarity).
CC       {ECO:0000250}.
CC   -!- CATALYTIC ACTIVITY: Selective hydrolysis of -Xaa-Xaa-|-Yaa- bonds
CC       in which each of the Xaa can be either Arg or Lys and Yaa can be
CC       either Ser or Ala.
CC   -!- CATALYTIC ACTIVITY: NTP + H(2)O = NDP + phosphate.
CC   -!- CATALYTIC ACTIVITY: ATP + H(2)O = ADP + phosphate.
CC   -!- SUBUNIT: Capsid protein C forms homodimers. prM and envelope
CC       protein E form heterodimers in the endoplasmic reticulum and
CC       Golgi. In immature particles, there are 60 icosahedrally organized
CC       trimeric spikes on the surface. Each spike consists of three
CC       heterodimers of envelope protein M precursor (prM) and envelope
CC       protein E. NS1 forms homodimers as well as homohexamers when
CC       secreted. NS1 may interact with NS4A. NS3 and NS2B form a
CC       heterodimer. NS3 is the catalytic subunit, whereas NS2B strongly
CC       stimulates the latter, acting as a cofactor. In the absence of
CC       NS2B, NS3 protease is unfolded and inactive (By similarity).
CC       {ECO:0000250}.
CC   -!- SUBCELLULAR LOCATION: Capsid protein C: Virion {ECO:0000305}.
CC   -!- SUBCELLULAR LOCATION: Peptide pr: Secreted {ECO:0000250}.
CC   -!- SUBCELLULAR LOCATION: Small envelope protein M: Virion membrane
CC       {ECO:0000250}; Multi-pass membrane protein {ECO:0000250}. Host
CC       endoplasmic reticulum membrane {ECO:0000250}; Multi-pass membrane
CC       protein {ECO:0000250}.
CC   -!- SUBCELLULAR LOCATION: Envelope protein E: Virion membrane
CC       {ECO:0000250}; Multi-pass membrane protein {ECO:0000250}. Host
CC       endoplasmic reticulum membrane {ECO:0000250}; Multi-pass membrane
CC       protein {ECO:0000250}.
CC   -!- SUBCELLULAR LOCATION: Non-structural protein 1: Secreted. Host
CC       endoplasmic reticulum membrane {ECO:0000250}; Peripheral membrane
CC       protein {ECO:0000250}; Lumenal side {ECO:0000250}.
CC   -!- SUBCELLULAR LOCATION: Non-structural protein 2A: Host endoplasmic
CC       reticulum membrane {ECO:0000305}; Multi-pass membrane protein
CC       {ECO:0000305}.
CC   -!- SUBCELLULAR LOCATION: Serine protease subunit NS2B: Host
CC       endoplasmic reticulum membrane {ECO:0000305}; Multi-pass membrane
CC       protein {ECO:0000305}.
CC   -!- SUBCELLULAR LOCATION: Serine protease NS3: Host endoplasmic
CC       reticulum membrane {ECO:0000250}; Peripheral membrane protein
CC       {ECO:0000250}; Cytoplasmic side {ECO:0000250}. Note=Remains non-
CC       covalently associated with serine protease subunit NS2B.
CC       {ECO:0000250}.
CC   -!- PTM: Specific enzymatic cleavages in vivo by the viral protease
CC       NS3 and host cell enzymes yield mature proteins. The nascent
CC       protein C contains a C-terminal hydrophobic domain that acts as a
CC       signal sequence for translocation of prM into the lumen of the ER.
CC       Mature soluble protein C is released after cleavage by NS3
CC       protease at a site upstream of this hydrophobic domain. prM is
CC       cleaved in post-Golgi vesicles by a host furin, releasing the
CC       mature small envelope protein M, and peptide pr (By similarity).
CC       {ECO:0000250}.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; M16574; AAA46251.1; -; Genomic_RNA.
DR   PIR; A27844; GNWVJF.
DR   PDB; 4R8T; X-ray; 2.13 A; A=1352-1369.
DR   PDBsum; 4R8T; -.
DR   ProteinModelPortal; P14403; -.
DR   SMR; P14403; -.
DR   PRIDE; P14403; -.
DR   GO; GO:0005576; C:extracellular region; IEA:UniProtKB-SubCell.
DR   GO; GO:0044167; C:host cell endoplasmic reticulum membrane; IEA:UniProtKB-SubCell.
DR   GO; GO:0016021; C:integral component of membrane; IEA:UniProtKB-KW.
DR   GO; GO:0019028; C:viral capsid; IEA:UniProtKB-KW.
DR   GO; GO:0019031; C:viral envelope; IEA:UniProtKB-KW.
DR   GO; GO:0055036; C:virion membrane; IEA:UniProtKB-SubCell.
DR   GO; GO:0005524; F:ATP binding; IEA:UniProtKB-KW.
DR   GO; GO:0003725; F:double-stranded RNA binding; IEA:InterPro.
DR   GO; GO:0004386; F:helicase activity; IEA:UniProtKB-KW.
DR   GO; GO:0046983; F:protein dimerization activity; IEA:InterPro.
DR   GO; GO:0004252; F:serine-type endopeptidase activity; IEA:InterPro.
DR   GO; GO:0005198; F:structural molecule activity; IEA:InterPro.
DR   GO; GO:0075512; P:clathrin-dependent endocytosis of virus by host cell; IEA:UniProtKB-KW.
DR   GO; GO:0039654; P:fusion of virus membrane with host endosome membrane; IEA:UniProtKB-KW.
DR   GO; GO:0019062; P:virion attachment to host cell; IEA:UniProtKB-KW.
DR   CDD; cd12149; Flavi_E_C; 1.
DR   Gene3D; 2.60.40.350; -; 1.
DR   Gene3D; 3.30.387.10; -; 1.
DR   Gene3D; 3.30.67.10; -; 1.
DR   InterPro; IPR000069; Env_glycoprot_M_flavivir.
DR   InterPro; IPR013755; Flav_gly_cen_dom_subdom1.
DR   InterPro; IPR001122; Flavi_capsidC.
DR   InterPro; IPR027287; Flavi_E_Ig-like.
DR   InterPro; IPR026470; Flavi_E_Stem/Anchor_dom.
DR   InterPro; IPR001157; Flavi_NS1.
DR   InterPro; IPR000752; Flavi_NS2A.
DR   InterPro; IPR000487; Flavi_NS2B.
DR   InterPro; IPR002535; Flavi_propep.
DR   InterPro; IPR000336; Flavivir/Alphavir_Ig-like.
DR   InterPro; IPR011998; Glycoprot_cen/dimer.
DR   InterPro; IPR013756; GlyE_cen_dom_subdom2.
DR   InterPro; IPR014756; Ig_E-set.
DR   Pfam; PF01003; Flavi_capsid; 1.
DR   Pfam; PF02832; Flavi_glycop_C; 1.
DR   Pfam; PF00869; Flavi_glycoprot; 1.
DR   Pfam; PF01004; Flavi_M; 1.
DR   Pfam; PF00948; Flavi_NS1; 1.
DR   Pfam; PF01005; Flavi_NS2A; 1.
DR   Pfam; PF01002; Flavi_NS2B; 1.
DR   Pfam; PF01570; Flavi_propep; 1.
DR   SUPFAM; SSF56983; SSF56983; 1.
DR   SUPFAM; SSF81296; SSF81296; 1.
DR   TIGRFAMs; TIGR04240; flavi_E_stem; 1.
DR   PROSITE; PS51527; FLAVIVIRUS_NS2B; 1.
PE   1: Evidence at protein level;
KW   3D-structure; ATP-binding; Capsid protein;
KW   Clathrin-mediated endocytosis of virus by host;
KW   Cleavage on pair of basic residues; Disulfide bond;
KW   Fusion of virus membrane with host endosomal membrane;
KW   Fusion of virus membrane with host membrane; Glycoprotein; Helicase;
KW   Host endoplasmic reticulum; Host membrane; Host-virus interaction;
KW   Hydrolase; Membrane; Multifunctional enzyme; Nucleotide-binding;
KW   Protease; Secreted; Serine protease; Transmembrane;
KW   Transmembrane helix; Viral attachment to host cell;
KW   Viral envelope protein; Viral penetration into host cytoplasm; Virion;
KW   Virus endocytosis by host; Virus entry into host cell.
FT   CHAIN        <1  >1440       Genome polyprotein.
FT                                /FTId=PRO_0000405200.
FT   CHAIN        <1     31       Capsid protein C. {ECO:0000250}.
FT                                /FTId=PRO_0000037869.
FT   PROPEP       32     53       ER anchor for the protein C, removed in
FT                                mature form by serine protease NS3.
FT                                {ECO:0000250}.
FT                                /FTId=PRO_0000405201.
FT   CHAIN        54    222       prM. {ECO:0000250}.
FT                                /FTId=PRO_0000405202.
FT   CHAIN        54    146       Peptide pr. {ECO:0000250}.
FT                                /FTId=PRO_0000037870.
FT   CHAIN       147    222       Small envelope protein M. {ECO:0000250}.
FT                                /FTId=PRO_0000037871.
FT   CHAIN       223    722       Envelope protein E. {ECO:0000250}.
FT                                /FTId=PRO_0000037872.
FT   CHAIN       723   1074       Non-structural protein 1. {ECO:0000250}.
FT                                /FTId=PRO_0000037873.
FT   CHAIN      1075   1301       Non-structural protein 2A. {ECO:0000250}.
FT                                /FTId=PRO_0000037874.
FT   CHAIN      1302   1432       Serine protease subunit NS2B.
FT                                {ECO:0000250}.
FT                                /FTId=PRO_0000037875.
FT   CHAIN      1433  >1440       Serine protease NS3. {ECO:0000250}.
FT                                /FTId=PRO_0000037876.
FT   TRANSMEM     36     56       Helical. {ECO:0000255}.
FT   TOPO_DOM     57    180       Extracellular. {ECO:0000255}.
FT   TRANSMEM    181    201       Helical. {ECO:0000255}.
FT   TOPO_DOM    202    207       Cytoplasmic. {ECO:0000255}.
FT   TRANSMEM    208    222       Helical. {ECO:0000255}.
FT   TOPO_DOM    223    674       Extracellular. {ECO:0000255}.
FT   INTRAMEM    675    695       Helical. {ECO:0000255}.
FT   TOPO_DOM    696    701       Extracellular. {ECO:0000255}.
FT   INTRAMEM    702    722       Helical. {ECO:0000255}.
FT   TOPO_DOM    723   1073       Extracellular. {ECO:0000255}.
FT   TRANSMEM   1074   1094       Helical. {ECO:0000255}.
FT   TOPO_DOM   1095   1105       Cytoplasmic. {ECO:0000255}.
FT   TRANSMEM   1106   1126       Helical. {ECO:0000255}.
FT   TOPO_DOM   1127   1147       Lumenal. {ECO:0000255}.
FT   TRANSMEM   1148   1168       Helical. {ECO:0000255}.
FT   TOPO_DOM   1169   1178       Cytoplasmic. {ECO:0000255}.
FT   TRANSMEM   1179   1199       Helical. {ECO:0000255}.
FT   TOPO_DOM   1200   1200       Lumenal. {ECO:0000255}.
FT   TRANSMEM   1201   1221       Helical. {ECO:0000255}.
FT   TOPO_DOM   1222   1302       Cytoplasmic. {ECO:0000255}.
FT   TRANSMEM   1303   1323       Helical. {ECO:0000255}.
FT   TOPO_DOM   1324   1326       Lumenal. {ECO:0000255}.
FT   TRANSMEM   1327   1347       Helical. {ECO:0000255}.
FT   TOPO_DOM   1348   1404       Cytoplasmic. {ECO:0000255}.
FT   INTRAMEM   1405   1425       Helical. {ECO:0000255}.
FT   TOPO_DOM   1426  >1440       Cytoplasmic. {ECO:0000255}.
FT   REGION     1355   1394       Interacts with and activates NS3
FT                                protease. {ECO:0000255|PROSITE-
FT                                ProRule:PRU00859}.
FT   SITE         31     32       Cleavage; by viral protease NS3.
FT                                {ECO:0000255}.
FT   SITE         53     54       Cleavage; by host signal peptidase.
FT                                {ECO:0000250}.
FT   SITE        146    147       Cleavage; by host furin. {ECO:0000250}.
FT   SITE        222    223       Cleavage; by host signal peptidase.
FT                                {ECO:0000255}.
FT   SITE        722    723       Cleavage; by host signal peptidase.
FT                                {ECO:0000255}.
FT   SITE       1074   1075       Cleavage; by host. {ECO:0000250}.
FT   SITE       1301   1302       Cleavage; by viral protease NS3.
FT                                {ECO:0000255}.
FT   SITE       1432   1433       Cleavage; by autolysis. {ECO:0000255}.
FT   CARBOHYD     68     68       N-linked (GlcNAc...) asparagine; by host.
FT                                {ECO:0000255}.
FT   CARBOHYD    376    376       N-linked (GlcNAc...) asparagine; by host.
FT                                {ECO:0000255}.
FT   CARBOHYD    852    852       N-linked (GlcNAc...) asparagine; by host.
FT                                {ECO:0000255}.
FT   CARBOHYD    929    929       N-linked (GlcNAc...) asparagine; by host.
FT                                {ECO:0000255}.
FT   DISULFID    225    252       {ECO:0000250}.
FT   DISULFID    282    338       {ECO:0000250}.
FT   DISULFID    296    327       {ECO:0000250}.
FT   DISULFID    314    343       {ECO:0000250}.
FT   DISULFID    412    509       {ECO:0000250}.
FT   DISULFID    526    557       {ECO:0000250}.
FT   NON_TER       1      1
FT   NON_TER    1440   1440
FT   STRAND     1352   1359       {ECO:0000244|PDB:4R8T}.
FT   HELIX      1365   1368       {ECO:0000244|PDB:4R8T}.
SQ   SEQUENCE   1440 AA;  158185 MW;  4D489A365A3C2E6E CRC64;
     SVAMKHLTSF KRELGTLIDA VNKRGRKQNK RGGNEGSIMW LASLAVVIAC AGAMKLSNFQ
     GKLLMTVNNT DIADVIVIPN PSKGENRCWV RAIDVGYMCE DTITYECPKL TMGNDPEDVD
     CWCDNQEVYV QYGRCTRTRH SKRSRRSVSV QTHGESSLVN KKEAWLDSTK ATRYLMKTEN
     WIVRNPGYAF LAAILGWMLG SNNGQRRWYF TILLLLVAPA YSFNCLGMGN RDFIEGASGA
     TWVDLVLEGD SCLTIMANDK PTLDVRMINI EAVQLAEVRS YCYHASVTDI STVARCPTTG
     EAHNEKRADS SYVCKQGFTD RGWGNGCGLF GKGSIDTCAK FSCTSKAIGR TIQPENIKYE
     VGIFVHGTTT SENHGNYSAQ VGASQAAKFT VTPNAPSITL KLGDYGEVTL DCEPRSGLNT
     EAFYVMTVGS KSFLVHREWF HDLALPWTPP SSTAWRNREL LMEFEEAHAT KQSVVALGSQ
     EGGLHQALAG AIVVEYSSSV KLTSGHLKCR LKMDKLALKG TTYGMCTEKF SFAKNPADTG
     HGTVVIELSY SGSDGPCKIP IVSVASLNDM TPVGRLVTVN PFVATSSANS KVLVEMEPPF
     GDSYIVVGRG DKQINHHWHK AGSTLGKAFS TTLKGAQRLA ALGDTAWDFG SIGGVFNSIG
     KAVHQVFGGA FRTLFGGMSW ITQGLMGALL LWMGVNARDR SIALAFLATG GVLVFLATNV
     HADTGCAIDI TRKEMRCGSG IFVHNDVEAW VDRYKYLPET PRSLAKIVHK AHKEGVCGVR
     SVTRLEHQMW EAVRDELNVL LKENAVDLSV VVNKPVGRYR SAPKRLSMTQ EKFEMGWKAW
     GKSILFAPEL ANSTFVVDGP ETKECPDEHR AWNSIEIEDF GFGITSTRVW LKIREESTDE
     CDGAIIGTAV KGHVAVHSDL SYWIESRYND TWKLERAVFG EVKSCTWPET HTLWGDGVEE
     SELIIPHTIA GPKSKHNRRE GYKTQNQGPW DENGIVLDFD YCPGTKVTIT EDCGKRGPSV
     RTTTDSGKLI TDWCCRSCSL PPLRFRTENG CWYGMEIRPV RHDETTLVRS QVDAFNGEMV
     DPFQLGLLVM FLATQEVLRK RWTARLTIPA VLGALLVLML GGITYTDLAR YVVLVAAAFA
     EANSGGDVLH LALIAVFKIQ PAFLVMNMLS TRWTNQENVV LVLGAAFFHL ASVDLQIGVH
     GILNAAAIAW MIVRAITFPT TSSVTMPVLA LLTPGMRALY LDTYRIILLV IGICSLLQER
     KKTMAKKKGA VLLGLALTST GWFSPTTIAA GLMVCNPNKK RGWPATEFLS AVGLMFAIVG
     GLAELDIESM SIPFMLAGLM AVSYVVSGKA TDMWLERAAD ISWEMDAAIT GSSRRLDVKL
     DDDGDFHLID DPGVPWKVWV LRMSCIGLAA LTPWAIVPAA FGYWLTLKTT KRGGVFWDTP
//