ID   POLG_BVDVN     STANDARD;      PRT;  3988 AA.
AC   P19711;
DT   01-FEB-1991, integrated into UniProtKB/Swiss-Prot.
DT   01-FEB-1996, sequence version 2.
DT   02-MAY-2006, entry version 59.
DE   Genome polyprotein [Contains: N-terminal protease (EC 3.4.22.-) (N-
DE   pro) (Autoprotease p20); Capsid protein C; E(rns) glycoprotein
DE   (gp44/48); Envelope glycoprotein E1 (gp33); Envelope glycoprotein E2
DE   (gp55); p7; Nonstructural protein 2-3; Nonstructural protein 2 (NS2);
DE   Protease/helicase NS3 (EC 3.4.21.-) (NTPase); Nonstructural protein 4A
DE   (NS4A); Nonstructural protein 4B (NS4B); Nonstructural protein 5A
DE   (NS5A); RNA-directed RNA polymerase (EC 2.7.7.48) (NS5B)].
OS   Bovine viral diarrhea virus (isolate NADL) (BVDV) (Mucosal disease
OS   virus).
OC   Viruses; ssRNA positive-strand viruses, no DNA stage; Flaviviridae;
OC   Pestivirus.
OX   NCBI_TaxID=11100;
RN   [1]
RP   NUCLEOTIDE SEQUENCE [GENOMIC RNA].
RX   MEDLINE=88265858; PubMed=2838957;
RA   Collett M.S., Larson R., Gold C., Strick D., Anderson D.K.,
RA   Purchio A.F.;
RT   "Molecular cloning and nucleotide sequence of the pestivirus bovine
RT   viral diarrhea virus.";
RL   Virology 165:191-199(1988).
RN   [2]
RP   GENOMIC ORGANIZATION.
RX   MEDLINE=88265859; PubMed=2838958;
RA   Collett M.S., Larson R., Belzer S.K., Retzel E.;
RT   "Proteins encoded by bovine viral diarrhea virus: the genomic
RT   organization of a pestivirus.";
RL   Virology 165:200-208(1988).
RN   [3]
RP   PROCESSING OF POLYPROTEIN, AND PROTEIN SEQUENCE OF 2363-2376;
RP   2427-2441; 2774-2788 AND 3270-3284.
RX   MEDLINE=97332366; PubMed=9188600;
RA   Xu J., Mendez E., Caron P.R., Lin C., Murcko M.A., Collett M.S.,
RA   Rice C.M.;
RT   "Bovine viral diarrhea virus NS3 serine proteinase: polyprotein
RT   cleavage sites, cofactor requirements, and molecular model of an
RT   enzyme essential for pestivirus replication.";
RL   J. Virol. 71:5312-5322(1997).
RN   [4]
RP   SUBCELLULAR LOCATION.
RX   MEDLINE=99281893; PubMed=10355762;
RA   Weiland F., Weiland E., Unger G., Saalmuller A., Thiel H.-J.;
RT   "Localization of pestiviral envelope proteins E(rns) and E2 at the
RT   cell surface and on isolated particles.";
RL   J. Gen. Virol. 80:1157-1165(1999).
RN   [5]
RP   ROLE OF BOVINE LOW-DENSITY-LIPOPROTEIN RECEPTOR IN VIRUS ATTACHMENT TO
RP   HOST CELL.
RX   PubMed=10535997; DOI=10.1073/pnas.96.22.12766;
RA   Agnello V., Abel G., Elfahal M., Knight G.B., Zhang Q.X.;
RT   "Hepatitis C virus and other flaviviridae viruses enter cells via low
RT   density lipoprotein receptor.";
RL   Proc. Natl. Acad. Sci. U.S.A. 96:12766-12771(1999).
RN   [6]
RP   INTERACTION OF E(RNS) WITH CELL SURFACE GLYCOSAMINOGLYCANS.
RX   PubMed=10644844;
RA   Iqbal M., Flick-Smith H., McCauley J.W.;
RT   "Interactions of bovine viral diarrhoea virus glycoprotein E(rns) with
RT   cell surface glycosaminoglycans.";
RL   J. Gen. Virol. 81:451-459(2000).
RN   [7]
RP   FUNCTION OF NS2-3.
RC   STRAIN=Isolate NADL Jiv 90(-);
RX   PubMed=14963137; DOI=10.1128/JVI.78.5.2414-2425.2004;
RA   Agapov E.V., Murray C.L., Frolov I., Qu L., Myers T.M., Rice C.M.;
RT   "Uncleaved NS2-3 is required for production of infectious bovine viral
RT   diarrhea virus.";
RL   J. Virol. 78:2414-2425(2004).
RN   [8]
RP   ROLE OF BOVINE CD46/MCP IN VIRUS ATTACHMENT TO HOST CELL.
RX   PubMed=14747544; DOI=10.1128/JVI.78.4.1792-1799.2004;
RA   Maurer K., Krey T., Moennig V., Thiel H.J., Rumenapf T.;
RT   "CD46 is a cellular receptor for bovine viral diarrhea virus.";
RL   J. Virol. 78:1792-1799(2004).
CC   -!- FUNCTION: Uncleaved NS2-3 is required for production of infectious
CC       virus. NS3 is a multifunctional protein with helicase, NTPase and
CC       protease activity. NS4A is a cofactor for the NS3 protease
CC       activity.
CC   -!- FUNCTION: P7 forms a leader sequence to properly orient NS2 in the
CC       membrane.
CC   -!- FUNCTION: E(rns), E1 and E2 are responsible for cell attachment
CC       and subsequent fusion of viral and cellular membranes. Binding to
CC       the target cell involves interactions with glycosaminoglycans and
CC       membrane proteins such as bovine CD46/MCP and low-density-
CC       lipoprotein receptor.
CC   -!- CATALYTIC ACTIVITY: Nucleoside triphosphate + RNA(n) = diphosphate
CC       + RNA(n+1).
CC   -!- SUBCELLULAR LOCATION: E(rns) and E2 glycoproteins are located at
CC       the surface of infected cells. NS2 is a membrane protein.
CC   -!- PTM: The E(rns) glycoprotein is heavily glycosylated. Forms
CC       disulfide-linked homodimers (By similarity).
CC   -!- PTM: The E1 and E2 envelope glycoproteins form disulfide-linked
CC       homodimers as well as heterodimers (By similarity).
CC   -!- PTM: The viral RNA of pestiviruses is expressed as a single
CC       polyprotein which undergoes post-translational proteolytic
CC       processing resulting in the production of at least eleven
CC       individual proteins. The N-terminal protease cleaves itself from
CC       the nascent polyprotein autocatalytically and thereby generates
CC       the N-terminus of the adjacent viral capsid protein C (By
CC       similarity).
CC   -!- PTM: Cleavage between E2 and p7 is partial (By similarity).
CC   -!- PTM: Cleavage between NS2 and NS3 is partial.
CC   -!- MISCELLANEOUS: BVDV is divided into two types: cytopathic and
CC       noncytopathic. Both types of virus can be found in animals
CC       suffering from mucosal disease, as a cytopathic BVDV can develop
CC       from a noncytopathic virus within the infected animal by
CC       deletions, mutations or insertions. Both types express uncleaved
CC       NS2-3, but cytopathic strains also express NS3. The cytopathic
CC       NADL strain contains an insertion (Jiv 90) that potentiates the
CC       partial cleavage of NS2-3. Removal of this insertion in the NADL
CC       Jiv 90(-) strain results in a noncytopathic strain in which NS2-3
CC       remains uncleaved.
CC   -!- SIMILARITY: Belongs to the pestiviruses polyprotein family.
CC   -!- SIMILARITY: Contains 1 helicase ATP-binding domain.
CC   -!- SIMILARITY: Contains 1 helicase C-terminal domain.
CC   -!- SIMILARITY: Contains 1 peptidase C53 domain.
CC   -!- SIMILARITY: Contains 1 peptidase S31 domain.
CC   -!- SIMILARITY: Contains 1 RdRp catalytic domain.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; M31182; AAA42854.1; -; Genomic_RNA.
DR   PIR; A29198; GNWVBV.
DR   PDB; 1S48; X-ray; A=3340-3948.
DR   PDB; 1S49; X-ray; A=3340-3948.
DR   PDB; 1S4F; X-ray; A/B/C/D=3348-3948.
DR   MEROPS; C53.001; -.
DR   MEROPS; S31.001; -.
DR   InterPro; IPR001410; DEAD.
DR   InterPro; IPR011545; DEAD/DEAH_N.
DR   InterPro; IPR002166; HCV_RdRP.
DR   InterPro; IPR001650; Helicase_C.
DR   InterPro; IPR008751; Peptidase_C53.
DR   InterPro; IPR000280; Peptidase_S31.
DR   InterPro; IPR007095; RNA_pol_DS_PS.
DR   InterPro; IPR007094; RNA_pol_PSvir.
DR   InterPro; IPR001568; RNase_T2.
DR   Pfam; PF00271; Helicase_C; 1.
DR   Pfam; PF05550; Peptidase_C53; 1.
DR   Pfam; PF05578; Peptidase_S31; 1.
DR   Pfam; PF00998; RdRP_3; 1.
DR   PRINTS; PR00729; CDVENDOPTASE.
DR   SMART; SM00487; DEXDc; 1.
DR   SMART; SM00490; HELICc; 1.
DR   PROSITE; PS51192; HELICASE_ATP_BIND_1; 1.
DR   PROSITE; PS51194; HELICASE_CTER; 1.
DR   PROSITE; PS50507; RDRP_SSRNA_POS; 1.
DR   PROSITE; PS00531; RNASE_T2_2; 1.
KW   3D-structure; ATP-binding; Direct protein sequencing; Glycoprotein;
KW   Helicase; Hydrolase; Membrane; Nucleotide-binding;
KW   Nucleotidyltransferase; Polyprotein; Protease; RNA replication;
KW   RNA-directed RNA polymerase; Serine protease; Thiol protease;
KW   Transferase; Transmembrane.
FT   CHAIN         1    168       N-terminal protease (By similarity).
FT                                /FTId=PRO_0000038024.
FT   CHAIN       169    270       Capsid protein C (By similarity).
FT                                /FTId=PRO_0000038025.
FT   CHAIN       271    497       E(rns) glycoprotein (By similarity).
FT                                /FTId=PRO_0000038026.
FT   CHAIN       498    659       Envelope glycoprotein E1 (By similarity).
FT                                /FTId=PRO_0000038027.
FT   CHAIN       660   1066       Envelope glycoprotein E2 (By similarity).
FT                                /FTId=PRO_0000038028.
FT   CHAIN      1067   1136       p7 (By similarity).
FT                                /FTId=PRO_0000038029.
FT   CHAIN      1137   2362       Nonstructural protein 2-3.
FT                                /FTId=PRO_0000038030.
FT   CHAIN      1137   1679       Nonstructural protein 2.
FT                                /FTId=PRO_0000038031.
FT   CHAIN      1680   2362       Protease/helicase NS3.
FT                                /FTId=PRO_0000038032.
FT   CHAIN      2363   2426       Nonstructural protein 4A (By similarity).
FT                                /FTId=PRO_0000038033.
FT   CHAIN      2427   2773       Nonstructural protein 4B (By similarity).
FT                                /FTId=PRO_0000038034.
FT   CHAIN      2774   3269       Nonstructural protein 5A (By similarity).
FT                                /FTId=PRO_0000038035.
FT   CHAIN      3270   3988       RNA-directed RNA polymerase (By
FT                                similarity).
FT                                /FTId=PRO_0000038036.
FT   TRANSMEM   1144   1164       Potential.
FT   TRANSMEM   1189   1209       Potential.
FT   TRANSMEM   1217   1237       Potential.
FT   TRANSMEM   1247   1267       Potential.
FT   TRANSMEM   1281   1301       Potential.
FT   TRANSMEM   1360   1380       Potential.
FT   TRANSMEM   1658   1678       Potential.
FT   DOMAIN        1    168       Peptidase C53.
FT   DOMAIN     1892   2050       Helicase ATP-binding.
FT   DOMAIN     2068   2233       Helicase C-terminal.
FT   DOMAIN     3608   3731       RdRp catalytic.
FT   ACT_SITE     22     22       N-terminal protease (By similarity).
FT   ACT_SITE     49     49       N-terminal protease (By similarity).
FT   ACT_SITE     69     69       N-terminal protease (By similarity).
FT   ACT_SITE   1748   1748       Charge relay system; protease NS3 (By
FT                                similarity).
FT   ACT_SITE   1785   1785       Charge relay system; protease NS3 (By
FT                                similarity).
FT   ACT_SITE   1842   1842       Charge relay system; protease NS3 (By
FT                                similarity).
FT   SITE        168    169       Cleavage (auto-) (By similarity).
FT   SITE        270    271       Cleavage (by host signal peptidase) (By
FT                                similarity).
FT   SITE        497    498       Cleavage.
FT   SITE        659    660       Cleavage (by host signal peptidase) (By
FT                                similarity).
FT   SITE       1066   1067       Cleavage; partial (by host signal
FT                                peptidase) (By similarity).
FT   SITE       1136   1137       Cleavage (by host signal peptidase) (By
FT                                similarity).
FT   SITE       1679   1680       Cleavage; partial (in cytopathic
FT                                strains).
FT   SITE       2362   2363       Cleavage (by NS3) (By similarity).
FT   SITE       2426   2427       Cleavage (by NS3) (By similarity).
FT   SITE       2773   2774       Cleavage (by NS3) (By similarity).
FT   SITE       3269   3270       Cleavage (by NS3) (By similarity).
FT   CARBOHYD    272    272       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    281    281       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    296    296       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    335    335       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    365    365       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    370    370       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    413    413       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    487    487       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    597    597       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    809    809       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    878    878       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    922    922       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    990    990       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   1357   1357       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   1419   1419       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   1451   1451       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   1803   1803       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2224   2224       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2307   2307       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2584   2584       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2772   2772       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2981   2981       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   3778   3778       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   3867   3867       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   3883   3883       N-linked (GlcNAc...) (Potential).
FT   HELIX      3364   3372
FT   STRAND     3373   3376
FT   STRAND     3379   3379
FT   STRAND     3382   3384
FT   STRAND     3388   3392
FT   TURN       3396   3397
FT   STRAND     3398   3398
FT   STRAND     3402   3404
FT   HELIX      3406   3414
FT   TURN       3415   3416
FT   HELIX      3419   3421
FT   STRAND     3422   3427
FT   HELIX      3431   3440
FT   TURN       3441   3442
FT   TURN       3451   3452
FT   HELIX      3453   3462
FT   TURN       3463   3464
FT   HELIX      3467   3469
FT   TURN       3470   3471
FT   STRAND     3473   3473
FT   HELIX      3478   3481
FT   TURN       3482   3484
FT   TURN       3487   3488
FT   TURN       3493   3494
FT   STRAND     3496   3496
FT   TURN       3499   3500
FT   HELIX      3501   3504
FT   TURN       3505   3506
FT   HELIX      3507   3518
FT   TURN       3519   3520
FT   STRAND     3527   3531
FT   STRAND     3535   3537
FT   HELIX      3539   3543
FT   STRAND     3544   3545
FT   STRAND     3549   3551
FT   STRAND     3555   3558
FT   STRAND     3561   3562
FT   HELIX      3563   3570
FT   STRAND     3571   3571
FT   HELIX      3572   3575
FT   TURN       3576   3577
FT   STRAND     3578   3579
FT   STRAND     3582   3582
FT   TURN       3583   3584
FT   STRAND     3585   3585
FT   HELIX      3586   3588
FT   STRAND     3589   3589
FT   HELIX      3591   3603
FT   TURN       3604   3604
FT   STRAND     3605   3612
FT   STRAND     3615   3615
FT   TURN       3616   3617
FT   HELIX      3618   3621
FT   HELIX      3624   3637
FT   STRAND     3638   3638
FT   HELIX      3640   3652
FT   TURN       3653   3654
FT   STRAND     3655   3660
FT   TURN       3661   3662
FT   STRAND     3663   3668
FT   TURN       3674   3675
FT   STRAND     3676   3676
FT   TURN       3677   3678
FT   HELIX      3679   3699
FT   HELIX      3703   3709
FT   STRAND     3710   3715
FT   TURN       3716   3717
FT   STRAND     3718   3724
FT   HELIX      3725   3741
FT   TURN       3742   3743
FT   STRAND     3746   3746
FT   TURN       3750   3751
FT   STRAND     3752   3753
FT   STRAND     3756   3759
FT   HELIX      3760   3762
FT   STRAND     3765   3765
FT   TURN       3766   3767
FT   STRAND     3768   3775
FT   TURN       3776   3777
FT   STRAND     3778   3778
FT   STRAND     3780   3785
FT   HELIX      3788   3797
FT   STRAND     3802   3804
FT   STRAND     3808   3810
FT   HELIX      3811   3822
FT   TURN       3823   3824
FT   HELIX      3826   3837
FT   STRAND     3838   3838
FT   TURN       3840   3841
FT   STRAND     3842   3842
FT   STRAND     3846   3854
FT   HELIX      3856   3864
FT   STRAND     3865   3866
FT   HELIX      3868   3870
FT   STRAND     3871   3874
FT   HELIX      3876   3882
FT   TURN       3883   3884
FT   HELIX      3885   3888
FT   TURN       3889   3890
FT   STRAND     3892   3892
FT   TURN       3894   3895
FT   HELIX      3896   3908
FT   STRAND     3910   3912
FT   TURN       3915   3918
FT   STRAND     3919   3919
FT   HELIX      3920   3926
FT   TURN       3927   3927
FT   STRAND     3935   3939
FT   STRAND     3941   3941
FT   STRAND     3943   3944
SQ   SEQUENCE   3988 AA;  449163 MW;  4474212F338661B8 CRC64;
     MELITNELLY KTYKQKPVGV EEPVYDQAGD PLFGERGAVH PQSTLKLPHK RGERDVPTNL
     ASLPKRGDCR SGNSRGPVSG IYLKPGPLFY QDYKGPVYHR APLELFEEGS MCETTKRIGR
     VTGSDGKLYH IYVCIDGCII IKSATRSYQR VFRWVHNRLD CPLWVTTCSD TKEEGATKKK
     TQKPDRLERG KMKIVPKESE KDSKTKPPDA TIVVEGVKYQ VRKKGKTKSK NTQDGLYHNK
     NKPQESRKKL EKALLAWAII AIVLFQVTMG ENITQWNLQD NGTEGIQRAM FQRGVNRSLH
     GIWPEKICTG VPSHLATDIE LKTIHGMMDA SEKTNYTCCR LQRHEWNKHG WCNWYNIEPW
     ILVMNRTQAN LTEGQPPREC AVTCRYDRAS DLNVVTQARD SPTPLTGCKK GKNFSFAGIL
     MRGPCNFEIA ASDVLFKEHE RISMFQDTTL YLVDGLTNSL EGARQGTAKL TTWLGKQLGI
     LGKKLENKSK TWFGAYAASP YCDVDRKIGY IWYTKNCTPA CLPKNTKIVG PGKFGTNAED
     GKILHEMGGH LSEVLLLSLV VLSDFAPETA SVMYLILHFS IPQSHVDVMD CDKTQLNLTV
     ELTTAEVIPG SVWNLGKYVC IRPNWWPYET TVVLAFEEVS QVVKLVLRAL RDLTRIWNAA
     TTTAFLVCLV KIVRGQMVQG ILWLLLITGV QGHLDCKPEF SYAIAKDERI GQLGAEGLTT
     TWKEYSPGMK LEDTMVIAWC EDGKLMYLQR CTRETRYLAI LHTRALPTSV VFKKLFDGRK
     QEDVVEMNDN FEFGLCPCDA KPIVRGKFNT TLLNGPAFQM VCPIGWTGTV SCTSFNMDTL
     ATTVVRTYRR SKPFPHRQGC ITQKNLGEDL HNCILGGNWT CVPGDQLLYK GGSIESCKWC
     GYQFKESEGL PHYPIGKCKL ENETGYRLVD STSCNREGVA IVPQGTLKCK IGKTTVQVIA
     MDTKLGPMPC RPYEIISSEG PVEKTACTFN YTKTLKNKYF EPRDSYFQQY MLKGEYQYWF
     DLEVTDHHRD YFAESILVVV VALLGGRYVL WLLVTYMVLS EQKALGIQYG SGEVVMMGNL
     LTHNNIEVVT YFLLLYLLLR EESVKKWVLL LYHILVVHPI KSVIVILLMI GDVVKADSGG
     QEYLGKIDLC FTTVVLIVIG LIIARRDPTI VPLVTIMAAL RVTELTHQPG VDIAVAVMTI
     TLLMVSYVTD YFRYKKWLQC ILSLVSAVFL IRSLIYLGRI EMPEVTIPNW RPLTLILLYL
     ISTTIVTRWK VDVAGLLLQC VPILLLVTTL WADFLTLILI LPTYELVKLY YLKTVRTDTE
     RSWLGGIDYT RVDSIYDVDE SGEGVYLFPS RQKAQGNFSI LLPLIKATLI SCVSSKWQLI
     YMSYLTLDFM YYMHRKVIEE ISGGTNIISR LVAALIELNW SMEEEESKGL KKFYLLSGRL
     RNLIIKHKVR NETVASWYGE EEVYGMPKIM TIIKASTLSK SRHCIICTVC EGREWKGGTC
     PKCGRHGKPI TCGMSLADFE ERHYKRIFIR EGNFEGMCSR CQGKHRRFEM DREPKSARYC
     AECNRLHPAE EGDFWAESSM LGLKITYFAL MDGKVYDITE WAGCQRVGIS PDTHRVPCHI
     SFGSRMPFRQ EYNGFVQYTA RGQLFLRNLP VLATKVKMLM VGNLGEEIGN LEHLGWILRG
     PAVCKKITEH EKCHINILDK LTAFFGIMPR GTTPRAPVRF PTSLLKVRRG LETAWAYTHQ
     GGISSVDHVT AGKDLLVCDS MGRTRVVCQS NNRLTDETEY GVKTDSGCPD GARCYVLNPE
     AVNISGSKGA VVHLQKTGGE FTCVTASGTP AFFDLKNLKG WSGLPIFEAS SGRVVGRVKV
     GKNEESKPTK IMSGIQTVSK NRADLTEMVK KITSMNRGDF KQITLATGAG KTTELPKAVI
     EEIGRHKRVL VLIPLRAAAE SVYQYMRLKH PSISFNLRIG DMKEGDMATG ITYASYGYFC
     QMPQPKLRAA MVEYSYIFLD EYHCATPEQL AIIGKIHRFS ESIRVVAMTA TPAGSVTTTG
     QKHPIEEFIA PEVMKGEDLG SQFLDIAGLK IPVDEMKGNM LVFVPTRNMA VEVAKKLKAK
     GYNSGYYYSG EDPANLRVVT SQSPYVIVAT NAIESGVTLP DLDTVIDTGL KCEKRVRVSS
     KIPFIVTGLK RMAVTVGEQA QRRGRVGRVK PGRYYRSQET ATGSKDYHYD LLQAQRYGIE
     DGINVTKSFR EMNYDWSLYE EDSLLITQLE ILNNLLISED LPAAVKNIMA RTDHPEPIQL
     AYNSYEVQVP VLFPKIRNGE VTDTYENYSF LNARKLGEDV PVYIYATEDE DLAVDLLGLD
     WPDPGNQQVV ETGKALKQVT GLSSAENALL VALFGYVGYQ ALSKRHVPMI TDIYTIEDQR
     LEDTTHLQYA PNAIKTDGTE TELKELASGD VEKIMGAISD YAAGGLEFVK SQAEKIKTAP
     LFKENAEAAK GYVQKFIDSL IENKEEIIRY GLWGTHTALY KSIAARLGHE TAFATLVLKW
     LAFGGESVSD HVKQAAVDLV VYYVMNKPSF PGDSETQQEG RRFVASLFIS ALATYTYKTW
     NYHNLSKVVE PALAYLPYAT SALKMFTPTR LESVVILSTT IYKTYLSIRK GKSDGLLGTG
     ISAAMEILSQ NPVSVGISVM LGVGAIAAHN AIESSEQKRT LLMKVFVKNF LDQAATDELV
     KENPEKIIMA LFEAVQTIGN PLRLIYHLYG VYYKGWEAKE LSERTAGRNL FTLIMFEAFE
     LLGMDSQGKI RNLSGNYILD LIYGLHKQIN RGLKKMVLGW APAPFSCDWT PSDERIRLPT
     DNYLRVETRC PCGYEMKAFK NVGGKLTKVE ESGPFLCRNR PGRGPVNYRV TKYYDDNLRE
     IKPVAKLEGQ VEHYYKGVTA KIDYSKGKML LATDKWEVEH GVITRLAKRY TGVGFNGAYL
     GDEPNHRALV ERDCATITKN TVQFLKMKKG CAFTYDLTIS NLTRLIELVH RNNLEEKEIP
     TATVTTWLAY TFVNEDVGTI KPVLGERVIP DPVVDINLQP EVQVDTSEVG ITIIGRETLM
     TTGVTPVLEK VEPDASDNQN SVKIGLDEGN YPGPGIQTHT LTEEIHNRDA RPFIMILGSR
     NSISNRAKTA RNINLYTGND PREIRDLMAA GRMLVVALRD VDPELSEMVD FKGTFLDREA
     LEALSLGQPK PKQVTKEAVR NLIEQKKDVE IPNWFASDDP VFLEVALKND KYYLVGDVGE
     LKDQAKALGA TDQTRIIKEV GSRTYAMKLS SWFLKASNKQ MSLTPLFEEL LLRCPPATKS
     NKGHMASAYQ LAQGNWEPLG CGVHLGTIPA RRVKIHPYEA YLKLKDFIEE EEKKPRVKDT
     VIREHNKWIL KKIRFQGNLN TKKMLNPGKL SEQLDREGRK RNIYNHQIGT IMSSAGIRLE
     KLPIVRAQTD TKTFHEAIRD KIDKSENRQN PELHNKLLEI FHTIAQPTLK HTYGEVTWEQ
     LEAGVNRKGA AGFLEKKNIG EVLDSEKHLV EQLVRDLKAG RKIKYYETAI PKNEKRDVSD
     DWQAGDLVVE KRPRVIQYPE AKTRLAITKV MYNWVKQQPV VIPGYEGKTP LFNIFDKVRK
     EWDSFNEPVA VSFDTKAWDT QVTSKDLQLI GEIQKYYYKK EWHKFIDTIT DHMTEVPVIT
     ADGEVYIRNG QRGSGQPDTS AGNSMLNVLT MMYGFCESTG VPYKSFNRVA RIHVCGDDGF
     LITEKGLGLK FANKGMQILH EAGKPQKITE GEKMKVAYRF EDIEFCSHTP VPVRWSDNTS
     SHMAGRDTAV ILSKMATRLD SSGERGTTAY EKAVAFSFLL MYSWNPLVRR ICLLVLSQQP
     ETDPSKHATY YYKGDPIGAY KDVIGRNLSE LKRTGFEKLA NLNLSLSTLG VWTKHTSKRI
     IQDCVAIGKE EGNWLVKPDR LISSKTGHLY IPDKGFTLQG KHYEQLQLRT ETNPVMGVGT
     ERYKLGPIVN LLLRRLKILL MTAVGVSS
//