ID   POLG_DEN4T              Reviewed;        3387 AA.
AC   Q2YHF0;
DT   12-DEC-2006, integrated into UniProtKB/Swiss-Prot.
DT   20-DEC-2005, sequence version 1.
DT   02-SEP-2008, entry version 31.
DE   RecName: Full=Genome polyprotein;
DE   Contains:
DE     RecName: Full=Protein C;
DE     AltName: Full=Core protein;
DE     AltName: Full=Capsid protein;
DE   Contains:
DE     RecName: Full=prM;
DE   Contains:
DE     RecName: Full=Peptide pr;
DE   Contains:
DE     RecName: Full=Small envelope protein M;
DE     AltName: Full=Matrix protein;
DE   Contains:
DE     RecName: Full=Envelope protein E;
DE   Contains:
DE     RecName: Full=Non-structural protein 1;
DE              Short=NS1;
DE   Contains:
DE     RecName: Full=Non-structural protein 2A;
DE              Short=NS2A;
DE   Contains:
DE     RecName: Full=Non-structural protein 2A-alpha;
DE   Contains:
DE     RecName: Full=Serine protease subunit NS2B;
DE     AltName: Full=Non-structural protein 2B;
DE   Contains:
DE     RecName: Full=Serine protease subunit NS3;
DE              EC=3.4.21.91;
DE     AltName: Full=Non-structural protein 3;
DE   Contains:
DE     RecName: Full=Non-structural protein 4A;
DE              Short=NS4A;
DE   Contains:
DE     RecName: Full=Peptide 2k;
DE   Contains:
DE     RecName: Full=Non-structural protein 4B;
DE              Short=NS4B;
DE   Contains:
DE     RecName: Full=RNA-directed RNA polymerase NS5;
DE              EC=2.7.7.48;
DE              EC=2.1.1.56;
DE     AltName: Full=Non-structural protein 5;
OS   Dengue virus type 4 (strain Thailand/0348/1991) (DENV-4).
OC   Viruses; ssRNA positive-strand viruses, no DNA stage; Flaviviridae;
OC   Flavivirus; Dengue virus group.
OX   NCBI_TaxID=408688;
OH   NCBI_TaxID=7159; Aedes aegypti (Yellowfever mosquito).
OH   NCBI_TaxID=7160; Aedes albopictus (Forest day mosquito).
OH   NCBI_TaxID=188700; Aedes polynesiensis.
OH   NCBI_TaxID=9606; Homo sapiens (Human).
RN   [1]
RP   NUCLEOTIDE SEQUENCE [GENOMIC RNA].
RX   PubMed=15476884; DOI=10.1016/j.virol.2004.08.003;
RA   Klungthong C., Zhang C., Mammen M.P. Jr., Ubol S., Holmes E.C.;
RT   "The molecular epidemiology of dengue virus serotype 4 in Bangkok,
RT   Thailand.";
RL   Virology 329:168-179(2004).
CC   -!- FUNCTION: Protein C packages viral RNA to form a viral
CC       nucleocapsid, and promotes virion budding (By similarity).
CC   -!- FUNCTION: prM acts as a chaperone for envelope protein E during
CC       intracellular virion assembly by masking and inactivating envelope
CC       protein E fusion peptide. prM is matured in the last step of
CC       virion assembly, presumably to avoid catastrophic activation of
CC       the viral fusion peptide induced by the acidic pH of the trans-
CC       Golgi network. After cleavage by host furin, the pr peptide is
CC       released in the extracellular medium and small envelope protein M
CC       and envelope protein E homodimers are dissociated (By similarity).
CC   -!- FUNCTION: Envelope protein E binds cell surface receptor and is
CC       involved in membrane fusion between virion and target cell.
CC       Synthesized as a heterodimer with prM which acts as a chaperone for
CC       envelope protein E. After cleavage of prM, envelope protein E
CC       dissociates from small envelope protein M and homodimerizes (By
CC       similarity).
CC   -!- FUNCTION: Non-structural protein 1 is slowly secreted from
CC       mammalian cells, but not from mosquito cells. Secreted form
CC       elicits protective immune response and plays an essential role in
CC       RNA replication. Soluble and membrane-associated NS1 may activate
CC       human complement and induce host vascular leakage. This effect
CC       might explain the clinical manifestations of dengue hemorrhagic
CC       fever and dengue shock syndrome (By similarity).
CC   -!- FUNCTION: Non-structural protein 2B is a required cofactor for the
CC       serine protease function of NS3 (By similarity).
CC   -!- FUNCTION: Serine protease NS3 displays three enzymatic activities:
CC       serine protease, NTPase and RNA helicase. NS3 serine protease, in
CC       association with NS2B, cleaves the polyprotein at dibasic sites in
CC       the cytoplasm: C-prM, NS2A-NS2B, NS2B-NS3, NS3-NS4A, NS4A-2K and
CC       NS4B-NS5. NS3 RNA helicase binds RNA and unwinds dsRNA in the 3'
CC       to 5' direction (By similarity).
CC   -!- FUNCTION: Non-structural protein 4A plays a role in RNA
CC       replication. Enhances inhibition of cell antiviral response by
CC       non-structural protein 4B (By similarity).
CC   -!- FUNCTION: Non-structural protein 4B prevents the establishment of
CC       cellular antiviral state by blocking the interferon-alpha/beta
CC       (IFN-alpha/beta) and IFN-gamma signaling pathways (By similarity).
CC   -!- FUNCTION: RNA-directed RNA polymerase NS5 replicates the viral (+)
CC       and (-) genome, and ensures the capping of genomes in the
CC       cytoplasm. May be involved in methylation of 5'RNA cap structure
CC       (By similarity).
CC   -!- CATALYTIC ACTIVITY: Selective hydrolysis of -Xaa-Xaa-|-Yaa- bonds
CC       in which each of the Xaa can be either Arg or Lys and Yaa can be
CC       either Ser or Ala.
CC   -!- CATALYTIC ACTIVITY: Nucleoside triphosphate + RNA(n) = diphosphate
CC       + RNA(n+1).
CC   -!- CATALYTIC ACTIVITY: S-adenosyl-L-methionine + G(5')pppR-RNA = S-
CC       adenosyl-L-homocysteine + m(7)G(5')pppR-RNA.
CC   -!- SUBUNIT: prM and envelope protein E form heterodimers in the
CC       endoplasmic reticulum and Golgi. Envelope protein E forms
CC       homodimers. NS1 forms homodimers as well as homohexamers when
CC       secreted. NS1 may interact with NS4A. NS3 and NS2B form a
CC       heterodimer. NS3 interacts with unphosphorylated NS5 (By
CC       similarity).
CC   -!- SUBCELLULAR LOCATION: Note=The virion is assembled in the
CC       endoplasmic reticulum lumen, transported by vesicles to the Golgi,
CC       then transported again to the cell membrane where it is released
CC       outside the cell.
CC   -!- SUBCELLULAR LOCATION: Protein C: Virion (By similarity).
CC   -!- SUBCELLULAR LOCATION: Peptide pr: Secreted (By similarity).
CC   -!- SUBCELLULAR LOCATION: Small envelope protein M: Virion membrane;
CC       Single-pass type I membrane protein (By similarity).
CC   -!- SUBCELLULAR LOCATION: Envelope protein E: Virion membrane; Single-
CC       pass type I membrane protein (By similarity).
CC   -!- SUBCELLULAR LOCATION: Non-structural protein 1: Secreted.
CC       Endoplasmic reticulum membrane; Peripheral membrane protein;
CC       Lumenal side (By similarity).
CC   -!- SUBCELLULAR LOCATION: Non-structural protein 2A-alpha: Endoplasmic
CC       reticulum membrane (By similarity).
CC   -!- SUBCELLULAR LOCATION: Non-structural protein 2A: Endoplasmic
CC       reticulum membrane (By similarity).
CC   -!- SUBCELLULAR LOCATION: Serine protease subunit NS2B: Endoplasmic
CC       reticulum membrane; Peripheral membrane protein; Cytoplasmic side
CC       (By similarity).
CC   -!- SUBCELLULAR LOCATION: Serine protease subunit NS3: Endoplasmic
CC       reticulum membrane; Peripheral membrane protein; Cytoplasmic side
CC       (By similarity).
CC   -!- SUBCELLULAR LOCATION: Non-structural protein 4A: Endoplasmic
CC       reticulum membrane; Peripheral membrane protein; Cytoplasmic side
CC       (By similarity).
CC   -!- SUBCELLULAR LOCATION: Non-structural protein 4B: Endoplasmic
CC       reticulum membrane; Multi-pass membrane protein (By similarity).
CC       Note=The C-terminal transmembrane domain of non-structural protein
CC       4B is presumably reoriented after cleavage on the lumenal side (By
CC       similarity).
CC   -!- SUBCELLULAR LOCATION: RNA-directed RNA polymerase NS5: Endoplasmic
CC       reticulum membrane; Peripheral membrane protein; Cytoplasmic side.
CC       Nucleus (By similarity).
CC   -!- DOMAIN: Transmembrane domains of the small envelope protein M and
CC       envelope protein E contain endoplasmic reticulum retention
CC       signals (By similarity).
CC   -!- PTM: Specific enzymatic cleavages in vivo yield mature proteins.
CC       The nascent protein C contains a C-terminal hydrophobic domain
CC       that acts as a signal sequence for translocation of prM into the
CC       lumen of the ER. Mature protein C is cleaved at a site upstream of
CC       this hydrophobic domain by NS3. prM is cleaved in post-Golgi
CC       vesicles by a host furin, releasing the mature small envelope
CC       protein M, and peptide pr. Non-structural protein 2A-alpha, a C-
CC       terminally truncated form of non-structural protein 2A, results
CC       from partial cleavage by NS3 (By similarity).
CC   -!- PTM: RNA-directed RNA polymerase NS5 is phosphorylated on serine
CC       residues. This phosphorylation may trigger NS5 nuclear
CC       localization (By similarity).
CC   -!- PTM: Envelope protein E and non-structural protein 1 are N-
CC       glycosylated (By similarity).
CC   -!- SIMILARITY: Contains 1 helicase ATP-binding domain.
CC   -!- SIMILARITY: Contains 1 helicase C-terminal domain.
CC   -!- SIMILARITY: Contains 1 peptidase S7 domain.
CC   -!- SIMILARITY: Contains 1 RdRp catalytic domain.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; AY618990; AAU89377.1; -; Genomic_RNA.
DR   SMR; Q2YHF0; 1647-2092, 2494-2755, 2761-3371.
DR   MEROPS; S07.001; -.
DR   GO; GO:0005789; C:endoplasmic reticulum membrane; IEA:UniProtKB-SubCell.
DR   InterPro; IPR014001; DEAD-like_N.
DR   InterPro; IPR011492; DEAD_Flavivir.
DR   InterPro; IPR001650; DNA/RNA_helicase_C.
DR   InterPro; IPR011999; Flav_glyE_cen_dm.
DR   InterPro; IPR013754; Flav_glyE_dim.
DR   InterPro; IPR001122; Flavi_capsidC.
DR   InterPro; IPR000069; Flavi_M.
DR   InterPro; IPR001157; Flavi_NS1.
DR   InterPro; IPR000752; Flavi_NS2A.
DR   InterPro; IPR000487; Flavi_NS2B.
DR   InterPro; IPR000404; Flavi_NS4A.
DR   InterPro; IPR001528; Flavi_NS4B.
DR   InterPro; IPR002535; Flavi_propep.
DR   InterPro; IPR000336; Flv_glyE_Ig-like.
DR   InterPro; IPR014412; Gen_Poly_FLV.
DR   InterPro; IPR014021; Helicase_SF1/SF2_ATP-bd.
DR   InterPro; IPR001850; Peptidase_S7.
DR   InterPro; IPR000208; RNA_pol_flaviviral.
DR   InterPro; IPR007094; RNA_pol_PSvir.
DR   InterPro; IPR002877; RrmJFtsJ_MeTrfase.
DR   Gene3D; G3DSA:2.60.98.10; Flav_glyE_dim; 1.
DR   Gene3D; G3DSA:2.60.40.350; Flv_glyE_Ig-like; 1.
DR   Pfam; PF01003; Flavi_capsid; 1.
DR   Pfam; PF07652; Flavi_DEAD; 1.
DR   Pfam; PF02832; Flavi_glycop_C; 1.
DR   Pfam; PF00869; Flavi_glycoprot; 1.
DR   Pfam; PF01004; Flavi_M; 1.
DR   Pfam; PF00948; Flavi_NS1; 1.
DR   Pfam; PF01005; Flavi_NS2A; 1.
DR   Pfam; PF01002; Flavi_NS2B; 1.
DR   Pfam; PF01350; Flavi_NS4A; 1.
DR   Pfam; PF01349; Flavi_NS4B; 1.
DR   Pfam; PF00972; Flavi_NS5; 1.
DR   Pfam; PF01570; Flavi_propep; 1.
DR   Pfam; PF01728; FtsJ; 1.
DR   Pfam; PF00949; Peptidase_S7; 1.
DR   PIRSF; PIRSF003817; Gen_Poly_FLV; 1.
DR   ProDom; PD001496; Flavi_NS1; 1.
DR   SMART; SM00487; DEXDc; 1.
DR   PROSITE; PS51192; HELICASE_ATP_BIND_1; 1.
DR   PROSITE; PS51194; HELICASE_CTER; FALSE_NEG.
DR   PROSITE; PS50507; RDRP_SSRNA_POS; 1.
PE   3: Inferred from homology;
KW   ATP-binding; Capsid protein; Cleavage on pair of basic residues;
KW   Complete proteome; Endoplasmic reticulum; Envelope protein;
KW   Glycoprotein; Helicase; Hydrolase; Membrane; Metal-binding;
KW   Multifunctional enzyme; Nucleotide-binding; Nucleotidyltransferase;
KW   Nucleus; Phosphoprotein; Protease; Ribonucleoprotein; RNA replication;
KW   RNA-binding; RNA-directed RNA polymerase; Secreted; Serine protease;
KW   Transcription; Transcription regulation; Transferase; Transmembrane;
KW   Viral nucleoprotein; Virion.
FT   CHAIN         1     99       Protein C.
FT                                /FTId=PRO_0000268131.
FT   PROPEP      100    113       ER anchor for the protein C, removed in
FT                                mature form by serine protease NS3 (By
FT                                similarity).
FT                                /FTId=PRO_0000268132.
FT   CHAIN       114    279       prM.
FT                                /FTId=PRO_0000268133.
FT   CHAIN       114    204       Peptide pr.
FT                                /FTId=PRO_0000268134.
FT   CHAIN       205    279       Small envelope protein M.
FT                                /FTId=PRO_0000268135.
FT   CHAIN       280    774       Envelope protein E.
FT                                /FTId=PRO_0000268136.
FT   CHAIN       775   1126       Non-structural protein 1.
FT                                /FTId=PRO_0000268137.
FT   CHAIN      1127   1344       Non-structural protein 2A.
FT                                /FTId=PRO_0000268138.
FT   CHAIN      1127      ?       Non-structural protein 2A-alpha.
FT                                /FTId=PRO_0000304550.
FT   CHAIN      1345   1474       Serine protease subunit NS2B.
FT                                /FTId=PRO_0000268139.
FT   CHAIN      1475   2092       Serine protease subunit NS3.
FT                                /FTId=PRO_0000268140.
FT   CHAIN      2093   2219       Non-structural protein 4A.
FT                                /FTId=PRO_0000268141.
FT   PEPTIDE    2220   2242       Peptide 2k.
FT                                /FTId=PRO_0000268142.
FT   CHAIN      2243   2487       Non-structural protein 4B.
FT                                /FTId=PRO_0000268143.
FT   CHAIN      2488   3387       RNA-directed RNA polymerase NS5.
FT                                /FTId=PRO_0000268144.
FT   TOPO_DOM      1    100       Cytoplasmic (Potential).
FT   TRANSMEM    101    121       Potential.
FT   TOPO_DOM    122    237       Extracellular (Potential).
FT   TRANSMEM    238    258       Potential.
FT   TOPO_DOM    259    264       Cytoplasmic (Potential).
FT   TRANSMEM    265    285       Potential.
FT   TOPO_DOM    286    724       Extracellular (Potential).
FT   TRANSMEM    725    745       Potential.
FT   TOPO_DOM    746    751       Cytoplasmic (Potential).
FT   TRANSMEM    752    772       Potential.
FT   TOPO_DOM    773   1155       Extracellular (Potential).
FT   TRANSMEM   1156   1176       Potential.
FT   TOPO_DOM   1177   1446       Cytoplasmic (Potential).
FT   TRANSMEM   1447   1467       Potential.
FT   TOPO_DOM   1468   2191       Lumenal (Potential).
FT   TRANSMEM   2192   2212       Potential.
FT   TOPO_DOM   2213   2219       Cytoplasmic (Potential).
FT   TRANSMEM   2220   2239       Potential.
FT   TOPO_DOM   2240   2343       Lumenal (Potential).
FT   TRANSMEM   2344   2364       Potential.
FT   TOPO_DOM   2365   2409       Cytoplasmic (Potential).
FT   TRANSMEM   2410   2430       Potential.
FT   TOPO_DOM   2431   2455       Lumenal (Potential).
FT   TRANSMEM   2456   2476       Potential.
FT   TOPO_DOM   2477   3387       Cytoplasmic (Potential).
FT   DOMAIN     1654   1810       Helicase ATP-binding.
FT   DOMAIN     1820   1987       Helicase C-terminal.
FT   DOMAIN     3016   3166       RdRp catalytic.
FT   NP_BIND    1667   1674       ATP (Potential).
FT   MOTIF      1758   1761       DEAH box (By similarity).
FT   ACT_SITE   1525   1525       Charge relay system; for serine protease
FT                                NS3 activity (By similarity).
FT   ACT_SITE   1549   1549       Charge relay system; for serine protease
FT                                NS3 activity (By similarity).
FT   ACT_SITE   1609   1609       Charge relay system; for serine protease
FT                                NS3 activity (By similarity).
FT   SITE         99    100       Cleavage; by serine protease NS3 (By
FT                                similarity).
FT   SITE        113    114       Cleavage; by host signal peptidase (By
FT                                similarity).
FT   SITE        204    205       Cleavage; by host furin (By similarity).
FT   SITE        279    280       Cleavage; by host signal peptidase (By
FT                                similarity).
FT   SITE        774    775       Cleavage; by host signal peptidase (By
FT                                similarity).
FT   SITE       1126   1127       Cleavage; by host (By similarity).
FT   SITE       1344   1345       Cleavage; by serine protease NS3 (By
FT                                similarity).
FT   SITE       1474   1475       Cleavage; by serine protease NS3 (By
FT                                similarity).
FT   SITE       2092   2093       Cleavage; by serine protease NS3 (By
FT                                similarity).
FT   SITE       2219   2220       Cleavage; by host signal peptidase (By
FT                                similarity).
FT   SITE       2242   2243       Cleavage; by serine protease NS3 (By
FT                                similarity).
FT   SITE       2487   2488       Cleavage; by serine protease NS3 (By
FT                                similarity).
FT   CARBOHYD    182    182       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    346    346       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    432    432       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    981    981       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2297   2297       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2301   2301       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2453   2453       N-linked (GlcNAc...) (Potential).
FT   DISULFID    282    309       By similarity.
FT   DISULFID    339    400       By similarity.
FT   DISULFID    353    384       By similarity.
FT   DISULFID    371    395       By similarity.
FT   DISULFID    464    564       By similarity.
FT   DISULFID    581    612       By similarity.
SQ SEQUENCE 3387 AA; 378438 MW; 1FEDC2D663A0F945 CRC64; MNQRKKVARP PFNMLKRERN RVSTPQGLVK RFSTGLFSGK GPLRMVLAFI TFLRVLSIPP TAGILKRWGQ LKKNKAIKIL TGFRKEIGRM LNILNGRKRS TITLLCLIPT VMAFHLSTRD GEPLMIVAKH ERGRPLLFKT TEGINKCTLI AMDLGEMCED TVTYKCPLLV NTEPEDIDCW CNLTSAWVMY GTCTQSGERR REKRSVALTP HSGMGLETRA ETWMSSEGAW KHAQRVESWI LRNPGFALLA GFMAYMIGQT GIQRTVFFIL MMLVAPSYGM RCVGVGNRDF VEGVSGGAWV DLVLEHGGCV TTMAQGKPTL DFELIKTTAK EVALLRTYCI EASISNITTA TRCPTQGEPY LKEEQDQQYI CRRDMVDRGW GNGCGLFGKG GVVTCAKFSC SGKITGNLVQ IENLEYTVVV TVHNGDTHAV GNDTSNHGVT ATITPRSPSV EVKLPDYGEL TLDCEPRSGI DFNEMILMKM KTKTWLVHKQ WFLDLPLPWT AGADTLEVHW NHKERMVTFK VPHAKRQDVT VLGSQEGAMH SALAGATEVD SGDGNHMFAG HLKCKVRMEK LRIKGMSYTM CSGKFSIDKE MAETQHGTTV VKVKYEGTGA PCKVPIEIRD VNKEKVVGRI ISSTPFAENT NSVTNIELEP PFGDSYIVIG VGDSALTLHW FRKGSSIGKM FESTYRGAKR MAILGETAWD FGSVGGLLTS LGKAVHQVFG SVYTTMFGGV SWMVRILIGL LVLWIGTNSR NTSMAMSCIA VGGITLFLGF TVHADMGCAV SWSGKELKCG SGIFVIDNVH TWTEQYKFQP ESPARLASAI LNAHKDGVCG IRSTTRLENV MWKQITNELN YVLWEGGHDL TVVAGDVKGV LSKGKRALAP PVNDLKYSWK TWGKAKIFTP ETRNSTFLVD GPDTSECPNE RRAWNFLEVE DYGFGMFTTN IWMKFREGSS EVCDHRLMSA AIKDQKAVHA DMGYWIESSK NQTWQIEKAS LIEVKTCLWP KTHTLWSNGV LESQMLIPKA YAGPISQHNY RQGYATQTVG PWHLGKLEID FGECPGTTVT IQEDCDHRGP SLRTTTASGK LVTQWCCRSC TMPPLRFLGE DGCWYGMEIR PLNEKEENMV KSQVSAGQGT SETFSMGLLC LTLFVEECLR RRVTRKHMIL VVVTTLCAII LGGLTWMDLL RALIMLGDTM SGRMGGQIHL AIMAVFKMSP GYVLGIFLRK LTSRETALMV IGMAMTTVLS IPHDLMEFID GISLGLILLK MVTHFDNTQV GTLALSLTFI KSTMPLVMAW RTIMAVLFVV TLIPLCRTSC LQKQSHWVEI TALILGAQAL PVYLMTLMKG ASKRSWPLNE GIMAVGLVSL LGSALLKNDV PLAGPMVAGG LLLAAYVMSG SSADLSLEKA ANVQWDEMAD ITGSSPIIEV KQDEDGSFSI RDVEETNMIT LLVKLALITV SGLYPLAIPV TMTLWYMWQV KTQRSGALWD VPSPAAAQKA TLTEGVYRIM QRGLFGKTQV GVGIHMEGVF HTMWHVTRGS VICHESGRLE PSWADVRNDM ISYGGGWRLG DKWDKEEDVQ VLAIEPGKNP KHVQTKPGLF KTLTGEIGAV TLDFKPGTSG SPIINRKGKV IGLYGNGVVT KSGDYVSAIT QAERIGEPDY EVDEDIFRKK RLTIMDLHPG AGKTKRILPS IVREALKRRL RTLILAPTRV VAAEMEEALR GLPIRYQTPA VKSEHTGREI VDLMCHATFT TRLLSSTRVP NYNLIVMDEA 
HFTDPSSVAA RGYISTRVEM GEAAAIFMTA TPPGTTDPFP QSNSPIEDIE REIPERSWNT GFDWITDYQG KTVWFVPSIK AGNDIANCLR KSGKKVIQLS RKTFDTEYPK TKLTDWDFVV TTDISEMGAN FRAGRVIDPR RCLKPVILTD GPERVILAGP IPVTPASAAQ RRGRIGRNPA QEDDQYVFSG DPLRNDEDHA HWTEAKMLLD NIYTPEGIIP TLFGPEREKT QAIDGEFRLR GEQRKTFVEL MRRGDLPVWL SYKVASAGIS YKDREWCFTG ERNNQILEEN MEVEIWTREG EKKKLRPKWL DARVYADPMA LKDFKEFASG RKSITLDILT EIASLPTYLS SRAKLALDNI VMLHTTERGG KAYQHALNEL PESLETLMLV ALLGAMTAGI FLFFMQGKGI GKLSMGLIAI AVASGLLWVA EIQPQWIAAS IILEFFLMVL LVPEPEKQRT PQDNQLIYVI LTILTIIALV AANEMGLIEK TKTDFGFYQA KTETTILDVD LRPASAWTLY AVATTILTPM LRHTIENTSA NLSLAAIANQ AAVLMGLGKG WPLHRMDLGV PLLAMGCYSQ VNPTTLTASL VMLLVHYAII GPGLQAKATR EAQKRTAAGI MKNPTVDGIT VIDLEPISYD PKFEKQLGQV MLLVLCAGQL LLMRTTWAFC EVLTLATGPI LTLWEGNPGR FWNTTIAVST ANIFRGSYLA GAGLAFSLIK NAQTPRRGTG TTGETLGEKW KRQLNSLDRK EFEEYKRSGI LEVDRTEAKS ALKDGSKIKY AVSRGTSKIR WIVERGMVKP KGKVVDLGCG RGGWSYYMAT LKNVTEVKGY TKGGPGHEEP IPMATYGWNL VKLHSGVDVF YKPTEQVDTL LCDIGESSSN PTIEEGRTLR VLKMVEPWLS SKPEFCIKVL NPYMPTVIEE LEKLQRKHGG SLVRCPLSRN STHEMYWVSG VSGNIVSSVN TTSKMLLNRF TTRHRKPTYE KDADLGAGTR SVSTETEKPD MTIIGRRLQR LQEEHKETWH YDHENPYRTW AYHGSYEAPS TGSASSMVNG VVKLLTKPWD VVPMVTQLAM TDTTPFGQQR VFKEKVDTRT PQPKPGTRVV MTTTANWLWA LLGRKKNPRL CTREEFISKV RSNAAIGAVF QEEQGWTSAS EAVNDSRFWE LVDKERALHQ EGKCESCVYN MMGKREKKLG EFGRAKGSRA IWYMWLGARF LEFEALGFLN EDHWFGRENS WSGVEGEGLH RLGYILEDID KKDGDLIYAD DTAGWDTRIT EDDLLNEELI TEQMAPHHKI LAKAIFKLTY QNKVVKVLRP TPKGAVMDII SRKDQRGSGQ VGTYGLNTFT NMEVQLIRQM EAEGVITRDD MHNPKGLKER VEKWLKECGV DRLKRMAISG DDCVVKPLDE RFSTSLLFLN DMGKVRKDIP QWEPSKGWKN WQEVPFCSHH FHKIFMKDGR SLVVPCRNQD ELIGRARISQ GAGWSLRETA CLGKAYAQMW SLMYFHRRDL RLASMAICSA VPTEWFPTSR TTWSIHAHHQ WMTTEDMLKV WNRVWIEDNP NMIDKTPVHS WEDIPYLGKR EDLWCGSLIG LSSRATWAKN IQTAITQVRN LIGKEEYVDY MPVMKRYSAH FESEGVL //