ID   POLG_BVDVN              Reviewed;        3988 AA.
AC   P19711;
DT   01-FEB-1991, integrated into UniProtKB/Swiss-Prot.
DT   01-FEB-1996, sequence version 2.
DT   01-MAY-2007, entry version 72.
DE   Genome polyprotein [Contains: N-terminal protease (EC 3.4.22.-) (N-
DE   pro) (Autoprotease p20); Capsid protein C; E(rns) glycoprotein
DE   (gp44/48); Envelope glycoprotein E1 (gp33); Envelope glycoprotein E2
DE   (gp55); p7; Non-structural protein 2-3; Non-structural protein 2
DE   (NS2); Polyprotein protease/helicase NS3 (EC 3.4.21.113) (NTPase);
DE   Non-structural protein 4A (NS4A); Non-structural protein 4B (NS4B);
DE   Non-structural protein 5A (NS5A); RNA-directed RNA polymerase
DE   (EC 2.7.7.48) (NS5B)].
OS   Bovine viral diarrhea virus (isolate NADL) (BVDV) (Mucosal disease
OS   virus).
OC   Viruses; ssRNA positive-strand viruses, no DNA stage; Flaviviridae;
OC   Pestivirus.
OX   NCBI_TaxID=11100;
OH   NCBI_TaxID=9913; Bos taurus (Bovine).
RN   [1]
RP   NUCLEOTIDE SEQUENCE [GENOMIC RNA].
RX   MEDLINE=88265858; PubMed=2838957; DOI=10.1016/0042-6822(88)90672-1;
RA   Collett M.S., Larson R., Gold C., Strick D., Anderson D.K.,
RA   Purchio A.F.;
RT   "Molecular cloning and nucleotide sequence of the pestivirus bovine
RT   viral diarrhea virus.";
RL   Virology 165:191-199(1988).
RN   [2]
RP   GENOMIC ORGANIZATION.
RX   MEDLINE=88265859; PubMed=2838958; DOI=10.1016/0042-6822(88)90673-3;
RA   Collett M.S., Larson R., Belzer S.K., Retzel E.;
RT   "Proteins encoded by bovine viral diarrhea virus: the genomic
RT   organization of a pestivirus.";
RL   Virology 165:200-208(1988).
RN   [3]
RP   PROCESSING OF POLYPROTEIN, AND PROTEIN SEQUENCE OF 2363-2376;
RP   2427-2441; 2774-2788 AND 3270-3284.
RX   MEDLINE=97332366; PubMed=9188600;
RA   Xu J., Mendez E., Caron P.R., Lin C., Murcko M.A., Collett M.S.,
RA   Rice C.M.;
RT   "Bovine viral diarrhea virus NS3 serine proteinase: polyprotein
RT   cleavage sites, cofactor requirements, and molecular model of an
RT   enzyme essential for pestivirus replication.";
RL   J. Virol. 71:5312-5322(1997).
RN   [4]
RP   SUBCELLULAR LOCATION.
RX   MEDLINE=99281893; PubMed=10355762;
RA   Weiland F., Weiland E., Unger G., Saalmuller A., Thiel H.-J.;
RT   "Localization of pestiviral envelope proteins E(rns) and E2 at the
RT   cell surface and on isolated particles.";
RL   J. Gen. Virol. 80:1157-1165(1999).
RN   [5]
RP   ROLE OF BOVINE LOW-DENSITY-LIPOPROTEIN RECEPTOR IN VIRUS ATTACHMENT TO
RP   HOST CELL.
RX   PubMed=10535997; DOI=10.1073/pnas.96.22.12766;
RA   Agnello V., Abel G., Elfahal M., Knight G.B., Zhang Q.X.;
RT   "Hepatitis C virus and other flaviviridae viruses enter cells via low
RT   density lipoprotein receptor.";
RL   Proc. Natl. Acad. Sci. U.S.A. 96:12766-12771(1999).
RN   [6]
RP   INTERACTION OF E(RNS) WITH CELL SURFACE GLYCOSAMINOGLYCANS.
RX   PubMed=10644844;
RA   Iqbal M., Flick-Smith H., McCauley J.W.;
RT   "Interactions of bovine viral diarrhoea virus glycoprotein E(rns) with
RT   cell surface glycosaminoglycans.";
RL   J. Gen. Virol. 81:451-459(2000).
RN   [7]
RP   FUNCTION OF NS2-3.
RC   STRAIN=Isolate NADL Jiv 90(-);
RX   PubMed=14963137; DOI=10.1128/JVI.78.5.2414-2425.2004;
RA   Agapov E.V., Murray C.L., Frolov I., Qu L., Myers T.M., Rice C.M.;
RT   "Uncleaved NS2-3 is required for production of infectious bovine viral
RT   diarrhea virus.";
RL   J. Virol. 78:2414-2425(2004).
RN   [8]
RP   ROLE OF BOVINE CD46/MCP IN VIRUS ATTACHMENT TO HOST CELL.
RX   PubMed=14747544; DOI=10.1128/JVI.78.4.1792-1799.2004;
RA   Maurer K., Krey T., Moennig V., Thiel H.-J., Ruemenapf T.;
RT   "CD46 is a cellular receptor for bovine viral diarrhea virus.";
RL   J. Virol. 78:1792-1799(2004).
CC   -!- FUNCTION: Uncleaved NS2-3 is required for production of infectious
CC       virus. NS3 is a multifunctional protein with helicase, NTPase and
CC       protease activity. NS4A is a cofactor for the NS3 protease
CC       activity.
CC   -!- FUNCTION: P7 forms a leader sequence to properly orient NS2 in the
CC       membrane.
CC   -!- FUNCTION: E(rns), E1 and E2 are responsible for cell attachment and
CC       subsequent fusion of viral and cellular membranes.
Binding to
CC       target cell involves interactions with glycosaminoglycans and
CC       membrane proteins such as bovine CD46/MCP and low-density-
CC       lipoprotein receptor.
CC   -!- CATALYTIC ACTIVITY: Leu is conserved at position P1 for all four
CC       cleavage sites. Alanine is found at position P1' of the NS4A-NS4B
CC       cleavage site, whereas serine is found at position P1' of the NS3-
CC       NS4A, NS4B-NS5A and NS5A-NS5B cleavage sites.
CC   -!- CATALYTIC ACTIVITY: Nucleoside triphosphate + RNA(n) = diphosphate
CC       + RNA(n+1).
CC   -!- SUBCELLULAR LOCATION: E(rns) glycoprotein: Cell surface. Envelope
CC       glycoprotein E2: Cell surface. Non-structural protein 2: Membrane.
CC       Note=E(rns) and E2 glycoproteins are located at the surface of
CC       infected cells. NS2 is a membrane protein.
CC   -!- PTM: The E(rns) glycoprotein is heavily glycosylated. Forms
CC       disulfide-linked homodimers (By similarity).
CC   -!- PTM: The E1 and E2 envelope glycoproteins form disulfide-linked
CC       homodimers as well as heterodimers (By similarity).
CC   -!- PTM: The viral RNA of pestiviruses is expressed as a single
CC       polyprotein which undergoes post-translational proteolytic
CC       processing resulting in the production of at least eleven
CC       individual proteins. The N-terminal protease cleaves itself from
CC       the nascent polyprotein autocatalytically and thereby generates
CC       the N-terminus of the adjacent viral capsid protein C (By
CC       similarity).
CC   -!- PTM: Cleavage between E2 and p7 is partial (By similarity).
CC   -!- PTM: Cleavage between NS2 and NS3 is partial.
CC   -!- MISCELLANEOUS: BVDV is divided into two types: cytopathic and non-
CC       cytopathic. Both types of virus can be found in animals suffering
CC       from mucosal disease, as a cytopathic BVDV can develop from a non-
CC       cytopathic virus within the infected animal by deletions,
CC       mutations or insertions. Both types express uncleaved NS2-3, but
CC       cytopathic strains also express NS3.
The cytopathic NADL strain contains
CC       an insertion (Jiv 90) that potentiates the partial cleavage of NS2-
CC       3. Removal of this insertion in the NADL Jiv 90(-) strain results
CC       in a non-cytopathic strain in which NS2-3 remains uncleaved.
CC   -!- SIMILARITY: Belongs to the pestiviruses polyprotein family.
CC   -!- SIMILARITY: Contains 1 helicase ATP-binding domain.
CC   -!- SIMILARITY: Contains 1 helicase C-terminal domain.
CC   -!- SIMILARITY: Contains 1 peptidase C53 domain.
CC   -!- SIMILARITY: Contains 1 peptidase S31 domain.
CC   -!- SIMILARITY: Contains 1 RdRp catalytic domain.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; M31182; AAA42854.1; -; Genomic_RNA.
DR   PIR; A29198; GNWVBV.
DR   PDB; 1S48; X-ray; A=3340-3948.
DR   PDB; 1S49; X-ray; A=3340-3948.
DR   PDB; 1S4F; X-ray; A/B/C/D=3348-3948.
DR   MEROPS; C53.001; -.
DR   MEROPS; C74.001; -.
DR   MEROPS; S31.001; -.
DR   InterPro; IPR014001; DEAD-like_N.
DR   InterPro; IPR014021; Helic_SF1/SF2_ATP_bd.
DR   InterPro; IPR001650; Helicase_C.
DR   InterPro; IPR009003; Pept_Ser_Cys.
DR   InterPro; IPR008751; Peptidase_C53.
DR   InterPro; IPR000280; Peptidase_S31.
DR   InterPro; IPR002166; RNA_pol_HCV.
DR   InterPro; IPR007094; RNA_pol_PSvir.
DR   InterPro; IPR001568; RNase_T2.
DR   Gene3D; G3DSA:3.40.50.300; G3DSA:3.40.50.300; 2.
DR   PANTHER; PTHR11821; PTHR11821; 1.
DR   Pfam; PF00271; Helicase_C; 1.
DR   Pfam; PF05550; Peptidase_C53; 1.
DR   Pfam; PF05578; Peptidase_S31; 1.
DR   Pfam; PF00998; RdRP_3; 1.
DR   PRINTS; PR00729; CDVENDOPTASE.
DR   SMART; SM00487; DEXDc; 1.
DR   SMART; SM00490; HELICc; 1.
DR   PROSITE; PS51192; HELICASE_ATP_BIND_1; 1.
DR   PROSITE; PS51194; HELICASE_CTER; 1.
DR   PROSITE; PS50507; RDRP_SSRNA_POS; 1.
DR   PROSITE; PS00531; RNASE_T2_2; 1.
KW   3D-structure; ATP-binding; Direct protein sequencing; Glycoprotein;
KW   Helicase; Hydrolase; Membrane; Nucleotide-binding;
KW   Nucleotidyltransferase; Protease; RNA replication;
KW   RNA-directed RNA polymerase; Serine protease; Thiol protease;
KW   Transferase; Transmembrane.
FT   CHAIN         1    168       N-terminal protease (By similarity).
FT                                /FTId=PRO_0000038024.
FT   CHAIN       169    270       Capsid protein C (By similarity).
FT                                /FTId=PRO_0000038025.
FT   CHAIN       271    497       E(rns) glycoprotein (By similarity).
FT                                /FTId=PRO_0000038026.
FT   CHAIN       498    659       Envelope glycoprotein E1 (By similarity).
FT                                /FTId=PRO_0000038027.
FT   CHAIN       660   1066       Envelope glycoprotein E2 (By similarity).
FT                                /FTId=PRO_0000038028.
FT   CHAIN      1067   1136       p7 (By similarity).
FT                                /FTId=PRO_0000038029.
FT   CHAIN      1137   2362       Non-structural protein 2-3.
FT                                /FTId=PRO_0000038030.
FT   CHAIN      1137   1679       Non-structural protein 2.
FT                                /FTId=PRO_0000038031.
FT   CHAIN      1680   2362       Polyprotein protease/helicase NS3.
FT                                /FTId=PRO_0000038032.
FT   CHAIN      2363   2426       Non-structural protein 4A (By
FT                                similarity).
FT                                /FTId=PRO_0000038033.
FT   CHAIN      2427   2773       Non-structural protein 4B (By
FT                                similarity).
FT                                /FTId=PRO_0000038034.
FT   CHAIN      2774   3269       Non-structural protein 5A (By
FT                                similarity).
FT                                /FTId=PRO_0000038035.
FT   CHAIN      3270   3988       RNA-directed RNA polymerase (By
FT                                similarity).
FT                                /FTId=PRO_0000038036.
FT   TRANSMEM   1144   1164       Potential.
FT   TRANSMEM   1189   1209       Potential.
FT   TRANSMEM   1217   1237       Potential.
FT   TRANSMEM   1247   1267       Potential.
FT   TRANSMEM   1281   1301       Potential.
FT   TRANSMEM   1360   1380       Potential.
FT   TRANSMEM   1658   1678       Potential.
FT   DOMAIN        1    168       Peptidase C53.
FT   DOMAIN     1892   2050       Helicase ATP-binding.
FT   DOMAIN     2068   2233       Helicase C-terminal.
FT   DOMAIN     3608   3731       RdRp catalytic.
FT   ACT_SITE     22     22       For N-terminal protease activity (By
FT                                similarity).
FT   ACT_SITE     49     49       For N-terminal protease activity (By
FT                                similarity).
FT   ACT_SITE     69     69       For N-terminal protease activity (By
FT                                similarity).
FT   ACT_SITE   1748   1748       Charge relay system; for serine protease
FT                                NS3 activity (By similarity).
FT   ACT_SITE   1785   1785       Charge relay system; for serine protease
FT                                NS3 activity (By similarity).
FT   ACT_SITE   1842   1842       Charge relay system; for serine protease
FT                                NS3 activity (By similarity).
FT   SITE        168    169       Cleavage; by autolysis (By similarity).
FT   SITE        270    271       Cleavage (by host signal peptidase) (By
FT                                similarity).
FT   SITE        497    498       Cleavage.
FT   SITE        659    660       Cleavage (by host signal peptidase) (By
FT                                similarity).
FT   SITE       1066   1067       Cleavage; partial (by host signal
FT                                peptidase) (By similarity).
FT   SITE       1136   1137       Cleavage (by host signal peptidase) (By
FT                                similarity).
FT   SITE       1679   1680       Cleavage; partial (in cytopathic
FT                                strains).
FT   SITE       2362   2363       Cleavage (by NS3) (By similarity).
FT   SITE       2426   2427       Cleavage (by NS3) (By similarity).
FT   SITE       2773   2774       Cleavage (by NS3) (By similarity).
FT   SITE       3269   3270       Cleavage (by NS3) (By similarity).
FT   CARBOHYD    272    272       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    281    281       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    296    296       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    335    335       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    365    365       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    370    370       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    413    413       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    487    487       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    597    597       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    809    809       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    878    878       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    922    922       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    990    990       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   1357   1357       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   1419   1419       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   1451   1451       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   1803   1803       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2224   2224       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2307   2307       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2584   2584       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2772   2772       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   2981   2981       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   3778   3778       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   3867   3867       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD   3883   3883       N-linked (GlcNAc...) (Potential).
FT   HELIX      3364   3372
FT   STRAND     3373   3376
FT   STRAND     3382   3384
FT   STRAND     3388   3392
FT   STRAND     3402   3404
FT   HELIX      3406   3414
FT   HELIX      3419   3421
FT   STRAND     3422   3427
FT   HELIX      3431   3440
FT   HELIX      3453   3462
FT   HELIX      3467   3469
FT   HELIX      3478   3481
FT   TURN       3482   3484
FT   HELIX      3501   3504
FT   HELIX      3507   3518
FT   STRAND     3527   3531
FT   STRAND     3535   3537
FT   HELIX      3539   3543
FT   STRAND     3549   3551
FT   STRAND     3555   3558
FT   HELIX      3563   3570
FT   HELIX      3572   3575
FT   HELIX      3586   3588
FT   HELIX      3591   3593
FT   HELIX      3594   3603
FT   STRAND     3605   3612
FT   HELIX      3618   3621
FT   HELIX      3624   3637
FT   HELIX      3640   3642
FT   HELIX      3643   3652
FT   STRAND     3655   3660
FT   STRAND     3663   3668
FT   HELIX      3679   3699
FT   HELIX      3703   3705
FT   HELIX      3706   3709
FT   STRAND     3710   3715
FT   STRAND     3718   3724
FT   HELIX      3725   3741
FT   STRAND     3756   3759
FT   HELIX      3760   3762
FT   STRAND     3768   3775
FT   STRAND     3780   3785
FT   HELIX      3788   3797
FT   STRAND     3802   3804
FT   STRAND     3808   3810
FT   HELIX      3811   3822
FT   HELIX      3826   3837
FT   STRAND     3846   3854
FT   HELIX      3856   3864
FT   HELIX      3868   3870
FT   STRAND     3871   3874
FT   HELIX      3876   3882
FT   HELIX      3885   3888
FT   HELIX      3896   3908
FT   STRAND     3910   3912
FT   TURN       3915   3918
FT   HELIX      3920   3926
FT   STRAND     3935   3939
SQ   SEQUENCE   3988 AA;  449163 MW;  4474212F338661B8 CRC64;
     MELITNELLY KTYKQKPVGV EEPVYDQAGD PLFGERGAVH PQSTLKLPHK RGERDVPTNL ASLPKRGDCR SGNSRGPVSG IYLKPGPLFY QDYKGPVYHR APLELFEEGS MCETTKRIGR VTGSDGKLYH IYVCIDGCII IKSATRSYQR VFRWVHNRLD CPLWVTTCSD TKEEGATKKK TQKPDRLERG KMKIVPKESE KDSKTKPPDA TIVVEGVKYQ VRKKGKTKSK NTQDGLYHNK NKPQESRKKL EKALLAWAII AIVLFQVTMG ENITQWNLQD NGTEGIQRAM FQRGVNRSLH GIWPEKICTG VPSHLATDIE LKTIHGMMDA SEKTNYTCCR LQRHEWNKHG WCNWYNIEPW ILVMNRTQAN LTEGQPPREC AVTCRYDRAS DLNVVTQARD SPTPLTGCKK GKNFSFAGIL MRGPCNFEIA ASDVLFKEHE
RISMFQDTTL YLVDGLTNSL EGARQGTAKL TTWLGKQLGI LGKKLENKSK TWFGAYAASP YCDVDRKIGY IWYTKNCTPA CLPKNTKIVG PGKFGTNAED GKILHEMGGH LSEVLLLSLV VLSDFAPETA SVMYLILHFS IPQSHVDVMD CDKTQLNLTV ELTTAEVIPG SVWNLGKYVC IRPNWWPYET TVVLAFEEVS QVVKLVLRAL RDLTRIWNAA TTTAFLVCLV KIVRGQMVQG ILWLLLITGV QGHLDCKPEF SYAIAKDERI GQLGAEGLTT TWKEYSPGMK LEDTMVIAWC EDGKLMYLQR CTRETRYLAI LHTRALPTSV VFKKLFDGRK QEDVVEMNDN FEFGLCPCDA KPIVRGKFNT TLLNGPAFQM VCPIGWTGTV SCTSFNMDTL ATTVVRTYRR SKPFPHRQGC ITQKNLGEDL HNCILGGNWT CVPGDQLLYK GGSIESCKWC GYQFKESEGL PHYPIGKCKL ENETGYRLVD STSCNREGVA IVPQGTLKCK IGKTTVQVIA MDTKLGPMPC RPYEIISSEG PVEKTACTFN YTKTLKNKYF EPRDSYFQQY MLKGEYQYWF DLEVTDHHRD YFAESILVVV VALLGGRYVL WLLVTYMVLS EQKALGIQYG SGEVVMMGNL LTHNNIEVVT YFLLLYLLLR EESVKKWVLL LYHILVVHPI KSVIVILLMI GDVVKADSGG QEYLGKIDLC FTTVVLIVIG LIIARRDPTI VPLVTIMAAL RVTELTHQPG VDIAVAVMTI TLLMVSYVTD YFRYKKWLQC ILSLVSAVFL IRSLIYLGRI EMPEVTIPNW RPLTLILLYL ISTTIVTRWK VDVAGLLLQC VPILLLVTTL WADFLTLILI LPTYELVKLY YLKTVRTDTE RSWLGGIDYT RVDSIYDVDE SGEGVYLFPS RQKAQGNFSI LLPLIKATLI SCVSSKWQLI YMSYLTLDFM YYMHRKVIEE ISGGTNIISR LVAALIELNW SMEEEESKGL KKFYLLSGRL RNLIIKHKVR NETVASWYGE EEVYGMPKIM TIIKASTLSK SRHCIICTVC EGREWKGGTC PKCGRHGKPI TCGMSLADFE ERHYKRIFIR EGNFEGMCSR CQGKHRRFEM DREPKSARYC AECNRLHPAE EGDFWAESSM LGLKITYFAL MDGKVYDITE WAGCQRVGIS PDTHRVPCHI SFGSRMPFRQ EYNGFVQYTA RGQLFLRNLP VLATKVKMLM VGNLGEEIGN LEHLGWILRG PAVCKKITEH EKCHINILDK LTAFFGIMPR GTTPRAPVRF PTSLLKVRRG LETAWAYTHQ GGISSVDHVT AGKDLLVCDS MGRTRVVCQS NNRLTDETEY GVKTDSGCPD GARCYVLNPE AVNISGSKGA VVHLQKTGGE FTCVTASGTP AFFDLKNLKG WSGLPIFEAS SGRVVGRVKV GKNEESKPTK IMSGIQTVSK NRADLTEMVK KITSMNRGDF KQITLATGAG KTTELPKAVI EEIGRHKRVL VLIPLRAAAE SVYQYMRLKH PSISFNLRIG DMKEGDMATG ITYASYGYFC QMPQPKLRAA MVEYSYIFLD EYHCATPEQL AIIGKIHRFS ESIRVVAMTA TPAGSVTTTG QKHPIEEFIA PEVMKGEDLG SQFLDIAGLK IPVDEMKGNM LVFVPTRNMA VEVAKKLKAK GYNSGYYYSG EDPANLRVVT SQSPYVIVAT NAIESGVTLP DLDTVIDTGL KCEKRVRVSS KIPFIVTGLK RMAVTVGEQA QRRGRVGRVK PGRYYRSQET ATGSKDYHYD LLQAQRYGIE DGINVTKSFR EMNYDWSLYE EDSLLITQLE 
ILNNLLISED LPAAVKNIMA RTDHPEPIQL AYNSYEVQVP VLFPKIRNGE VTDTYENYSF LNARKLGEDV PVYIYATEDE DLAVDLLGLD WPDPGNQQVV ETGKALKQVT GLSSAENALL VALFGYVGYQ ALSKRHVPMI TDIYTIEDQR LEDTTHLQYA PNAIKTDGTE TELKELASGD VEKIMGAISD YAAGGLEFVK SQAEKIKTAP LFKENAEAAK GYVQKFIDSL IENKEEIIRY GLWGTHTALY KSIAARLGHE TAFATLVLKW LAFGGESVSD HVKQAAVDLV VYYVMNKPSF PGDSETQQEG RRFVASLFIS ALATYTYKTW NYHNLSKVVE PALAYLPYAT SALKMFTPTR LESVVILSTT IYKTYLSIRK GKSDGLLGTG ISAAMEILSQ NPVSVGISVM LGVGAIAAHN AIESSEQKRT LLMKVFVKNF LDQAATDELV KENPEKIIMA LFEAVQTIGN PLRLIYHLYG VYYKGWEAKE LSERTAGRNL FTLIMFEAFE LLGMDSQGKI RNLSGNYILD LIYGLHKQIN RGLKKMVLGW APAPFSCDWT PSDERIRLPT DNYLRVETRC PCGYEMKAFK NVGGKLTKVE ESGPFLCRNR PGRGPVNYRV TKYYDDNLRE IKPVAKLEGQ VEHYYKGVTA KIDYSKGKML LATDKWEVEH GVITRLAKRY TGVGFNGAYL GDEPNHRALV ERDCATITKN TVQFLKMKKG CAFTYDLTIS NLTRLIELVH RNNLEEKEIP TATVTTWLAY TFVNEDVGTI KPVLGERVIP DPVVDINLQP EVQVDTSEVG ITIIGRETLM TTGVTPVLEK VEPDASDNQN SVKIGLDEGN YPGPGIQTHT LTEEIHNRDA RPFIMILGSR NSISNRAKTA RNINLYTGND PREIRDLMAA GRMLVVALRD VDPELSEMVD FKGTFLDREA LEALSLGQPK PKQVTKEAVR NLIEQKKDVE IPNWFASDDP VFLEVALKND KYYLVGDVGE LKDQAKALGA TDQTRIIKEV GSRTYAMKLS SWFLKASNKQ MSLTPLFEEL LLRCPPATKS NKGHMASAYQ LAQGNWEPLG CGVHLGTIPA RRVKIHPYEA YLKLKDFIEE EEKKPRVKDT VIREHNKWIL KKIRFQGNLN TKKMLNPGKL SEQLDREGRK RNIYNHQIGT IMSSAGIRLE KLPIVRAQTD TKTFHEAIRD KIDKSENRQN PELHNKLLEI FHTIAQPTLK HTYGEVTWEQ LEAGVNRKGA AGFLEKKNIG EVLDSEKHLV EQLVRDLKAG RKIKYYETAI PKNEKRDVSD DWQAGDLVVE KRPRVIQYPE AKTRLAITKV MYNWVKQQPV VIPGYEGKTP LFNIFDKVRK EWDSFNEPVA VSFDTKAWDT QVTSKDLQLI GEIQKYYYKK EWHKFIDTIT DHMTEVPVIT ADGEVYIRNG QRGSGQPDTS AGNSMLNVLT MMYGFCESTG VPYKSFNRVA RIHVCGDDGF LITEKGLGLK FANKGMQILH EAGKPQKITE GEKMKVAYRF EDIEFCSHTP VPVRWSDNTS SHMAGRDTAV ILSKMATRLD SSGERGTTAY EKAVAFSFLL MYSWNPLVRR ICLLVLSQQP ETDPSKHATY YYKGDPIGAY KDVIGRNLSE LKRTGFEKLA NLNLSLSTLG VWTKHTSKRI IQDCVAIGKE EGNWLVKPDR LISSKTGHLY IPDKGFTLQG KHYEQLQLRT ETNPVMGVGT ERYKLGPIVN LLLRRLKILL MTAVGVSS //