ID   Q9DTF0_9HEPC            Unreviewed;      3010 AA.
AC   Q9DTF0;
DT   01-MAR-2001, integrated into UniProtKB/TrEMBL.
DT   01-MAR-2001, sequence version 1.
DT   06-MAR-2007, entry version 32.
DE   Polyprotein.
OS   Hepatitis C virus.
OC   Viruses; ssRNA positive-strand viruses, no DNA stage; Flaviviridae;
OC   Hepacivirus.
OX   NCBI_TaxID=11103;
RN   [1]
RP   NUCLEOTIDE SEQUENCE.
RC   TISSUE=Serum;
RX   PubMed=11348851; DOI=10.1016/S1386-6346(00)00141-8;
RA   Takahashi K., Iwata K., Matsumoto M., Matsumoto H., Nakao K.,
RA   Hatahara T., Ohta Y., Kanai K., Maruo H., Baba K., Hijikata M.,
RA   Mishiro S.;
RT   "Hepatitis C virus (HCV) genotype 1b sequences from fifteen patients
RT   with hepatocellular carcinoma: the 'progression score' revisited.";
RL   Hepatol. Res. 20:161-171(2001).
RN   [2]
RP   NUCLEOTIDE SEQUENCE.
RX   MEDLINE=92044457; PubMed=1658209;
RA   Oshima M., Tsuchiya M., Yagasaki M., Orita T., Hasegawa M.,
RA   Tomonoh K., Kojima T., Hirata Y., Yamamoto O., Sho Y.;
RT   "cDNA clones of Japanese hepatitis C virus genomes derived from a
RT   single patient show sequence heterogeneity.";
RL   J. Gen. Virol. 72:2805-2809(1991).
CC   -!- FUNCTION: Core protein packages viral RNA to form a viral
CC       nucleocapsid, and promotes virion budding. Modulates viral
CC       translation initiation by interacting with HCV IRES and 40S
CC       ribosomal subunit. Also regulates many host cellular functions
CC       such as signaling pathways and apoptosis. Prevents the
CC       establishment of cellular antiviral state by blocking the
CC       interferon-alpha/beta (IFN-alpha/beta) and IFN-gamma signaling
CC       pathways and by inducing human STAT1 degradation. Thought to play
CC       a role in virus-mediated cell transformation leading to
CC       hepatocellular carcinomas. Interacts with, and activates STAT3
CC       leading to cellular transformation. May repress the promoter of
CC       p53, and sequester CREB3 and SP110 isoform 3/Sp110b in the
CC       cytoplasm. Also represses cell cycle negative regulating factor
CC       CDKN1A, thereby interrupting an important checkpoint of normal
CC       cell cycle regulation. Targets transcription factors involved in
CC       the regulation of inflammatory responses and in the immune
CC       response: suppresses NF-kappaB activation, and activates AP-1.
CC       Could mediate apoptotic pathways through association with TNF-type
CC       receptors TNFRSF1A and LTBR, although its effect on death
CC       receptor-induced apoptosis remains controversial. Enhances TRAIL
CC       mediated apoptosis, suggesting that it might play a role in
CC       immune-mediated liver cell injury. Serum core protein is able to
CC       bind C1QR1 at the T-cell surface, resulting in down-regulation of
CC       T-lymphocyte proliferation. May transactivate human MYC, Rous
CC       sarcoma virus LTR, and SV40 promoters. May suppress the human FOS
CC       and HIV-1 LTR activity. May alter lipid metabolism by interacting
CC       with hepatocellular proteins involved in lipid accumulation and
CC       storage (By similarity).
CC   -!- FUNCTION: Envelope glycoproteins E1 and E2 are involved in virus
CC       attachment to the host cell as well as in virus endocytosis and
CC       fusion with host membrane. E2 inhibits human EIF2AK2/PKR
CC       activation, preventing the establishment of an antiviral state. E2
CC       is a viral ligand for CD209/DC-SIGN and CLEC4M/DC-SIGNR, which are
CC       respectively found on dendritic cells (DCs), and on liver
CC       sinusoidal endothelial cells and macrophage-like cells of lymph
CC       node sinuses. These interactions allow capture of circulating HCV
CC       particles by these cells and subsequent transmission to permissive
CC       cells. DCs are professional antigen presenting cells, critical for
CC       host immunity by inducing specific immune responses against a
CC       broad variety of pathogens. They act as sentinels in various
CC       tissues where they entrap pathogens and convey them to local
CC       lymphoid tissue or lymph node for establishment of immunity.
CC       Capture of circulating HCV particles by these SIGN+ cells may
CC       facilitate virus infection of proximal hepatocytes and lymphocyte
CC       subpopulations and may be essential for the establishment of
CC       persistent infection (By similarity).
CC   -!- FUNCTION: NS3 displays three enzymatic activities: serine
CC       protease, NTPase and RNA helicase. NS3 serine protease, in
CC       association with NS4A, is responsible for the cleavages of NS3-
CC       NS4A, NS4A-NS4B, NS4B-NS5A and NS5A-NS5B. NS3/NS4A complex also
CC       prevents phosphorylation of human IRF3, thus preventing the
CC       establishment of dsRNA induced antiviral state. NS3 RNA helicase
CC       binds to RNA and unwinds dsRNA in the 3' to 5' direction, and
CC       likely stable RNA secondary structure in the template strand (By
CC       similarity).
CC   -!- FUNCTION: NS4B induces a specific membrane alteration that serves
CC       as a scaffold for the virus replication complex. This membrane
CC       alteration gives rise to the so-called ER-derived membranous web
CC       that contains the replication complex (By similarity).
CC   -!- FUNCTION: NS5A is a component of the replication complex involved
CC       in RNA-binding. Down-regulates viral IRES translation initiation.
CC       Mediates interferon resistance, presumably by interacting with and
CC       inhibiting human EIF2AK2/PKR. The hyperphosphorylated form of NS5A
CC       is an inhibitor of viral replication (By similarity).
CC   -!- FUNCTION: NS5B is an RNA-dependent RNA polymerase that plays an
CC       essential role in virus replication (By similarity).
CC   -!- FUNCTION: P7 seems to be a polymeric ion channel protein
CC       (viroporin) and is inhibited by the antiviral drug amantadine.
CC       Also inhibited by long-alkyl-chain iminosugar derivatives.
CC       Essential for infectivity (By similarity).
CC   -!- FUNCTION: Protease NS2-3 is a cysteine protease responsible for
CC       the autocatalytic cleavage of NS2-NS3. Seems to undergo self-
CC       inactivation following maturation (By similarity).
CC   -!- CATALYTIC ACTIVITY: Hydrolysis of four peptide bonds in the viral
CC       precursor polyprotein, commonly with Asp or Glu in the P6
CC       position, Cys or Thr in P1 and Ser or Ala in P1'.
CC   -!- CATALYTIC ACTIVITY: N nucleoside triphosphate = N diphosphate +
CC       {RNA}(N).
CC   -!- CATALYTIC ACTIVITY: NTP + H(2)O = NDP + phosphate.
CC   -!- SUBUNIT: Core protein is a homomultimer that binds the C-terminal
CC       part of E1. Interacts with numerous cellular proteins. Interacts
CC       with human STAT1, inducing its degradation and human STAT3,
CC       constitutively activating it. Associates with human LTBR and
CC       TNFRSF1A receptors and possibly induces apoptosis. Binds to human
CC       SP110 isoform 3/Sp110b, HNRPK, C1QR1, YWHAE, DDX3X, APOA2 and RXRA
CC       proteins. Interacts with human CREB3 nuclear transcription
CC       protein, triggering cell transformation. May interact with human
CC       p53. Also binds human cytokeratins KRT8, KRT18, KRT19 and VIM
CC       (vimentin). E1 and E2 glycoproteins form a heterodimer that binds
CC       to human LDLR, CD81 and SCARB1 receptors, but this binding is not
CC       sufficient for infection; some additional liver-specific cofactors
CC       may be needed. E2 binds and inhibits human EIF2AK2/PKR. Also binds
CC       human CD209/DC-SIGN and CLEC4M/DC-SIGNR. p7 forms a homopolymer.
CC       NS2 forms a homodimer and seems to interact with all other
CC       nonstructural (NS) proteins. NS4A interacts with NS3 serine
CC       protease and stabilizes its folding. NS3-NS4A complex is essential
CC       for the activation of the latter and allows membrane anchorage of
CC       NS3. NS3 interacts with TANK-binding kinase TBK1. NS4B and NS5A
CC       form homodimers and seem to interact with all other nonstructural
CC       (NS) proteins. NS5A also interacts with human EIF2AK2/PKR, GRB2,
CC       PIK3R1 and with most Src-family kinases. NS5B is a homooligomer
CC       (By similarity).
CC   -!- SUBCELLULAR LOCATION: Note=Virion assembly and budding occur
CC       from the endoplasmic reticulum (ER) membrane or ER-derived
CC       membranes. The C-terminal transmembrane domain of core protein
CC       contains an ER signal leading the nascent polyprotein to the ER
CC       membrane. The C-termini of E1, E2, and p7 membrane domains act as
CC       signal sequences. After cleavage by host signal peptidase, these
CC       membrane sequences are retained at the C-termini of the
CC       respective proteins (core, E1, E2 and p7) as ER membrane anchors.
CC       Core protein is cytoplasmic. It is also located on mitochondrial
CC       and ER membranes, as well as at the surface of lipid droplets. A
CC       minor proportion is present in the nucleus. An unknown proportion
CC       is secreted. E1, E2, NS2 and NS4B are integral ER membrane
CC       proteins. The C-terminal transmembrane domains of envelope
CC       glycoproteins E1 and E2 form a hairpin structure before cleavage
CC       by host signal peptidase. A reorientation of the second
CC       hydrophobic stretch occurs after cleavage producing a single
CC       reoriented transmembrane domain. These events explain the final
CC       topology of these proteins. ER retention of E1 and E2 is leaky
CC       and, in overexpression conditions, a small fraction of both
CC       proteins reaches the plasma membrane. NS3 is associated with the
CC       ER membrane through its binding to NS4A. Membrane insertion of the
CC       membrane-anchored proteins NS4A, NS5A and NS5B occurs after
CC       processing by the NS3 protease. NS5A is perinuclear. A fraction of
CC       p7 localizes to the plasma membrane (By similarity).
CC   -!- DOMAIN: The N-terminal one-third of serine protease NS3 contains
CC       the protease activity. This region contains a zinc atom that does
CC       not belong to the active site, but may play a structural rather
CC       than a catalytic role. This region is essential for the activity
CC       of protease NS2-3, possibly by contributing to the folding of the
CC       latter. The helicase activity is located in the C-terminus of NS3
CC       (By similarity).
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; AB049087; BAB18800.1; -; Genomic_RNA.
DR   PIR; A61196; A61196.
DR   PIR; PQ0246; PQ0246.
DR   PIR; PQ0251; PQ0251.
DR   PIR; PQ0252; PQ0252.
DR   PIR; PQ0254; PQ0254.
DR   PIR; PS0329; PS0329.
DR   HSSP; Q8JYS1; 1CWX.
DR   SMR; Q9DTF0; 902-1026, 1029-1657, 2008-2170, 2420-2949.
DR   euHCVdb; AB049087; -.
DR   GO; GO:0005783; C:endoplasmic reticulum; IEA:UniProtKB-KW.
DR   GO; GO:0016021; C:integral to membrane; IEA:UniProtKB-KW.
DR   GO; GO:0030529; C:ribonucleoprotein complex; IEA:UniProtKB-KW.
DR   GO; GO:0019028; C:viral capsid; IEA:InterPro.
DR   GO; GO:0019031; C:viral envelope; IEA:InterPro.
DR   GO; GO:0005524; F:ATP binding; IEA:InterPro.
DR   GO; GO:0008234; F:cysteine-type peptidase activity; IEA:UniProtKB-KW.
DR   GO; GO:0004386; F:helicase activity; IEA:InterPro.
DR   GO; GO:0003723; F:RNA binding; IEA:InterPro.
DR   GO; GO:0003968; F:RNA-directed RNA polymerase activity; IEA:InterPro.
DR   GO; GO:0004252; F:serine-type endopeptidase activity; IEA:UniProtKB-KW.
DR   GO; GO:0005198; F:structural molecule activity; IEA:InterPro.
DR   GO; GO:0008270; F:zinc ion binding; IEA:UniProtKB-KW.
DR   GO; GO:0030683; P:evasion of host immune response by virus; IEA:UniProtKB-KW.
DR   GO; GO:0006508; P:proteolysis; IEA:InterPro.
DR   GO; GO:0006355; P:regulation of transcription, DNA-dependent; IEA:UniProtKB-KW.
DR   GO; GO:0006410; P:transcription, RNA-dependent; IEA:UniProtKB-KW.
DR   GO; GO:0019087; P:transformation of host cell by virus; IEA:InterPro.
DR   GO; GO:0019079; P:viral genome replication; IEA:InterPro.
DR   InterPro; IPR014001; DEAD-like_N.
DR   InterPro; IPR002522; HCV_capsid.
DR   InterPro; IPR002521; HCV_core.
DR   InterPro; IPR002519; HCV_env.
DR   InterPro; IPR002531; HCV_NS1.
DR   InterPro; IPR000745; HCV_NS4a.
DR   InterPro; IPR001490; HCV_NS4b.
DR   InterPro; IPR002868; HCV_NS5a.
DR   InterPro; IPR013193; HCV_NS5a_1b.
DR   InterPro; IPR013192; HCV_NS5a_Zn_bd.
DR   InterPro; IPR002166; HCV_RdRP.
DR   InterPro; IPR014021; Helic_SF1/SF2_ATP_bd.
DR   InterPro; IPR001650; Helicase_C.
DR   InterPro; IPR002518; Pept_C18_HCV_NS2.
DR   InterPro; IPR009003; Pept_Ser_Cys.
DR   InterPro; IPR004109; Peptidase_S29.
DR   InterPro; IPR007094; RNA_pol_PSvir.
DR   Pfam; PF01543; HCV_capsid; 1.
DR   Pfam; PF01542; HCV_core; 1.
DR   Pfam; PF01539; HCV_env; 1.
DR   Pfam; PF01560; HCV_NS1; 1.
DR   Pfam; PF01538; HCV_NS2; 1.
DR   Pfam; PF01006; HCV_NS4a; 1.
DR   Pfam; PF01001; HCV_NS4b; 1.
DR   Pfam; PF01506; HCV_NS5a; 1.
DR   Pfam; PF08300; HCV_NS5a_1a; 1.
DR   Pfam; PF08301; HCV_NS5a_1b; 1.
DR   Pfam; PF00271; Helicase_C; 1.
DR   Pfam; PF02907; Peptidase_S29; 1.
DR   Pfam; PF00998; RdRP_3; 1.
DR   SMART; SM00487; DEXDc; 1.
DR   PROSITE; PS51192; HELICASE_ATP_BIND_1; 1.
DR   PROSITE; PS51194; HELICASE_CTER; 1.
DR   PROSITE; PS50507; RDRP_SSRNA_POS; 1.
KW   ATP-binding; Capsid protein; Endoplasmic reticulum; Envelope protein;
KW   Helicase; Host-virus interaction; Hydrolase;
KW   Interferon antiviral system evasion; Nucleotide-binding;
KW   Nucleotidyltransferase; Oncogene; Protease; RNA replication;
KW   RNA-binding; RNA-directed RNA polymerase; Ribonucleoprotein;
KW   Serine protease; Thiol protease; Transcription;
KW   Transcription regulation; Transferase; Transmembrane;
KW   Viral nucleoprotein; Virion protein; Zinc.
SQ   SEQUENCE   3010 AA;  327156 MW;  7A9C7B1273266FF3 CRC64;
     MSTNPKPQRK TKRNTNRRPQ DVKFPGGGQI VGGVYLLPRR GPRLGVRATR KTSERSQPRG
     RRQPIPKARQ PEGRAWAQPG YPWPLYGNEG LGWAGWLLSP RGSRPSWGPT DPRRRSRNLG
     KVIDTLTCGF ADLMGYIPLV GGPLGGAARA LAHGVRVLED GVNYATGNLP GCSFSIFLLA
     LLSCLTIPAS AYEVRNVSGV YHVTNDCSNS SIVYEAADMI MHTPGCVPCV RENDCSRCWV
     ALTPTLAARN SSIPTTTIRR HVDLLVGAAA FCSAMYVGDL CGSIFLVSQL FTFSPRRYWT
     VQDCNCSIYP GHVSGHRMAW DMMMNWSPTT ALVVSQLLRI PQSVVDMVAG AHWGVLAGLA
     YYSMVGNWAK VLIVMLLFAG VDGVTHMTGG QVSHNTRSFM SLFTCGPSQK IQLINTNGSW
     HINRTALNCN DSLQTGFLAA LFYAHNLNSS GCPERMASCR SIDKFAQGWG PITHVVPDTW
     DQRPYCWHYA PKPCGIVPAS QVCGPVYCFT PSPVVVGTTD RFGVPTYTWG ENETDVLFLN
     NTRPPQGNWF GCTWMNSTGF TKTCGGPPCN IGGVGNNTLI CPTDCFRKHP EATYTKCGSG
     PWLTPRCMVD YPYRLWHYPC TVNFTIFKVR MYVGGVEHRL NAACNWTRGE RCDLEDRDRS
     ELSPLLLSTT EWQVLPCSFT PLPALSTGLI HLHQNIVDVQ YLYGIGSAVV SVVIKWEYVL
     LLFLLLADAR VCSCLWMMLL IAQAEAALEN LVVLNAASVA GAHGTLSFLV FFCAAWYIKG
     KLVPGAAYAL YGVWPLLLLL LALPHRAYAM DPEMAASCGG AVFVGLALLT LSPHYKAFLA
     RLIWWLQYFI TRAEAHLQVW IPPLNVRGGR DAIILLACAV HPELIFDITK ILLAILGPLM
     VLQAGLTRVP YFVRAQGLIR VCMLVRKVAG GHYFQMALMK LAALTGTYVY DHLTPLRDWA
     HVGLRDLAVA VEPVVFSDME TKIITWGADT AACGDIISGL PVSARRGREI LLGPADSFRE
     QGWRLLAPIT AYSQQTRGLL GCIITSLTGR DKNQVEGEVQ VVSTATQSFL ATCVNGVCWT
     VYHGAGSKTL AGPKGPITQM YTNVDQDLVG WQAPPGARSL TPCTCGSSDL YLVTRHADVI
     PVRRRGDSRG SLLSPRPVSY LKGSSGGPLL CPSGHAVGIF RAAVCTRGVA KAVDFIPVES
     METTMRSPVF TDNSSPPAVP QTFQVAHLHA PTGSGKSTKV PAAYAAQGYK VLVLNPSVAA
     TLSFGAYMSK AHGVDPNIRT GVRTITTGAP ITYSTYGKFL ADGGCSGGAY DIIICDECHS
     TDSTSILGIG TVLDQAETAG ARLVLLATAT PPGSVTVPHP NIEEVALSNT GEIPFYGKAI
     PIETIKGGRH LIFCHSKKKC DELAAKLSAL GVNAVAYYRG LDVSVIPTSG NVVVVATDAL
     MTGYTGDFDS VIDCNTCVTQ TVDFSLDPTF TIETTTVPQD AVSRSQRRGR TGRGRAGIYR
     FVTPGERPSG MFDSSVLCEC YDAGCAWYEL TPAETSVRLR AYLNTPGLPV CQDHLEFWES
     VFTGLTHIDA HFLSQTKQAG DNFPYLVAYQ ATVCARAQAP PPSWDQMWKC LIRLKPTLHG
     PTPLLYRLGA VQNEVTLTHP ITKFIMACMS ADLEVVTSTW VLVGGVLAAL AAYCLTTGSV
     VIVGRIILSG KPAVIPDREV LYREFDEMEE CASHLPYIEQ GMQLAEQFKQ KALGLLQTAT
     KQAEAAAPVV ESKWRALETF WAKHMWNFIS GIQYLAGLST LPGNPAIASL MAFTASITSP
     LTTQHTLLFN ILGGWVAAQL APPSAASAFV GAGIAGAAVG SIGLGKVLVD ILAGYGAGVA
     GALVAFKVMS GEMPSAEDMV NLLPAILSPG ALVVGVVCAA ILRRHVGPGE GAVQWMNRLI
     AFASRGNHVS PTHYVPESDA AARVTQILSS LTITQLLKRL HQWINEDCST PCSGSWLRDV
     WDWICTVLTD FKTWLQSKLL PRLPGVPFFS CQRGYKGVWL GDGVMQTTCP CGAQISGHVK
     NGSMKIVGPK TCSNTWHGTF PINAYTTGPC TPSPAPNYSR ALWRVAAEEY VEVTRVGEFH
     YVTGMTTDNV KCPCQVPSPE FFTEVDGVRL HRYAPACKPL LRDEVTFQVG LNQYPVGSQL
     PCEPEPDVAV LTSMLTDPSH ITAETARRRL ARGSPPSLAS SSASQLSAPS LKATCTTCHD
     SPDADLIEAN LLWRQEMGGN ITRVESENKV VILDSFDPLR AEEDEREVSV AAEILRKTRK
     FPPAIPIWAR PDYNPPLLES WKDPDYVPPV VHGCPLPPTK APPIPPPRRK RTVILTESTV
     SSALAELATK TFGSSGSSAV DSGTATAPPD QPSDDGDTGS DVGSYSSMPP LEGEPGDPDL
     SDGSWSTVSE EAGEDVVCCS MSYTWTGALI TPCGAEETKL PINALSNSLL RHHNMVYATT
     SRSASQRQRK VTFDRLQVLD DHYRDVLKEM KAKASTVKAK LLSVEEACKL TPPHSARSKF
     GYGAKDVRNL SSKAVNHIRS VWKDLLEDTE TPINTTIMAK SEVFCVQPEK GGRKPARLIV
     FPDLGVRVCE KMALYDVVST LPQAVMGSSY GFQYSPGQRV EFLVNAWKSK KSPMGFAYDT
     RCFDSTVTES DIRVEESIYQ CCDLAPEARQ VIRSLTERLY IGGPLTNSKG QNCGYRRCRA
     SGVLTTSCGN TLTCYLKASA ACRAAKLQDC TMLVCGDDLV VICESAGTQE DAANLRVFTE
     AMTRYSAPPG DPPRPEYDLE LITSCSSNVS VAHDAAGKRV YYLTRDPITP LARAAWETAR
     HTPVNSWLGN IIMYAPTLWA RMILMTHFFS ILLAQEQLEK ALDCQIYGAV YSIEPLDLPQ
     IIQRLHGLSA FSLHSYSPGE INRVASCLRK LGVPPLRVWR HRARSVRAKL LSQGGRAATC
     GKYLFNWAVR TKLKLTPIPA ASQLDLSGWF VAGYSGGDIY HSLSRARPRW FMWCLLLLSV
     GVGIYLLPNR
//