ID POLG_BVDVN Reviewed; 3988 AA. AC P19711; DT 01-FEB-1991, integrated into UniProtKB/Swiss-Prot. DT 01-FEB-1996, sequence version 2. DT 09-FEB-2010, entry version 96. DE RecName: Full=Genome polyprotein; DE Contains: DE RecName: Full=N-terminal protease; DE Short=N-pro; DE EC=3.4.22.-; DE AltName: Full=Autoprotease p20; DE Contains: DE RecName: Full=Capsid protein C; DE Contains: DE RecName: Full=E(rns) glycoprotein; DE AltName: Full=gp44/48; DE Contains: DE RecName: Full=Envelope glycoprotein E1; DE AltName: Full=gp33; DE Contains: DE RecName: Full=Envelope glycoprotein E2; DE AltName: Full=gp55; DE Contains: DE RecName: Full=p7; DE Contains: DE RecName: Full=Non-structural protein 2-3; DE Contains: DE RecName: Full=Cysteine protease NS2; DE EC=3.4.22.-; DE AltName: Full=Non-structural protein 2; DE Contains: DE RecName: Full=Serine protease/NTPase/helicase NS3; DE EC=3.4.21.113; DE EC=3.6.1.15; DE EC=3.6.1.-; DE AltName: Full=Non-structural protein 3; DE Contains: DE RecName: Full=Non-structural protein 4A; DE Short=NS4A; DE Contains: DE RecName: Full=Non-structural protein 4B; DE Short=NS4B; DE Contains: DE RecName: Full=Non-structural protein 5A; DE Short=NS5A; DE Contains: DE RecName: Full=RNA-directed RNA polymerase; DE EC=2.7.7.48; DE AltName: Full=NS5B; OS Bovine viral diarrhea virus (isolate NADL) (BVDV) (Mucosal disease OS virus). OC Viruses; ssRNA positive-strand viruses, no DNA stage; Flaviviridae; OC Pestivirus. OX NCBI_TaxID=11100; OH NCBI_TaxID=9913; Bos taurus (Bovine). RN [1] RP NUCLEOTIDE SEQUENCE [GENOMIC RNA]. RX MEDLINE=88265858; PubMed=2838957; DOI=10.1016/0042-6822(88)90672-1; RA Collett M.S., Larson R., Gold C., Strick D., Anderson D.K., RA Purchio A.F.; RT "Molecular cloning and nucleotide sequence of the pestivirus bovine RT viral diarrhea virus."; RL Virology 165:191-199(1988). RN [2] RP GENOMIC ORGANIZATION. 
RX MEDLINE=88265859; PubMed=2838958; DOI=10.1016/0042-6822(88)90673-3; RA Collett M.S., Larson R., Belzer S.K., Retzel E.; RT "Proteins encoded by bovine viral diarrhea virus: the genomic RT organization of a pestivirus."; RL Virology 165:200-208(1988). RN [3] RP PROTEIN SEQUENCE OF 2363-2376; 2427-2441; 2774-2788 AND 3270-3284, AND RP PROTEOLYTIC PROCESSING OF POLYPROTEIN. RX MEDLINE=97332366; PubMed=9188600; RA Xu J., Mendez E., Caron P.R., Lin C., Murcko M.A., Collett M.S., RA Rice C.M.; RT "Bovine viral diarrhea virus NS3 serine proteinase: polyprotein RT cleavage sites, cofactor requirements, and molecular model of an RT enzyme essential for pestivirus replication."; RL J. Virol. 71:5312-5322(1997). RN [4] RP SUBCELLULAR LOCATION. RX MEDLINE=99281893; PubMed=10355762; RA Weiland F., Weiland E., Unger G., Saalmuller A., Thiel H.-J.; RT "Localization of pestiviral envelope proteins E(rns) and E2 at the RT cell surface and on isolated particles."; RL J. Gen. Virol. 80:1157-1165(1999). RN [5] RP ROLE OF BOVINE LOW-DENSITY-LIPOPROTEIN RECEPTOR IN VIRUS ATTACHMENT TO RP HOST CELL. RX PubMed=10535997; DOI=10.1073/pnas.96.22.12766; RA Agnello V., Abel G., Elfahal M., Knight G.B., Zhang Q.X.; RT "Hepatitis C virus and other flaviviridae viruses enter cells via low RT density lipoprotein receptor."; RL Proc. Natl. Acad. Sci. U.S.A. 96:12766-12771(1999). RN [6] RP INTERACTION OF E(RNS) WITH CELL SURFACE GLYCOSAMINOGLYCANS. RX PubMed=10644844; RA Iqbal M., Flick-Smith H., McCauley J.W.; RT "Interactions of bovine viral diarrhoea virus glycoprotein E(rns) with RT cell surface glycosaminoglycans."; RL J. Gen. Virol. 81:451-459(2000). RN [7] RP FUNCTION OF NS2-3. RC STRAIN=Isolate NADL Jiv 90(-); RX PubMed=14963137; DOI=10.1128/JVI.78.5.2414-2425.2004; RA Agapov E.V., Murray C.L., Frolov I., Qu L., Myers T.M., Rice C.M.; RT "Uncleaved NS2-3 is required for production of infectious bovine viral RT diarrhea virus."; RL J. Virol. 78:2414-2425(2004). 
RN [8] RP ROLE OF BOVINE CD46/MCP IN VIRUS ATTACHMENT TO HOST CELL. RX PubMed=14747544; DOI=10.1128/JVI.78.4.1792-1799.2004; RA Maurer K., Krey T., Moennig V., Thiel H.-J., Ruemenapf T.; RT "CD46 is a cellular receptor for bovine viral diarrhea virus."; RL J. Virol. 78:1792-1799(2004). CC -!- FUNCTION: E(rns), E1 and E2 are responsible for cell attachment and CC subsequent fusion of viral and cellular membranes. Binding to the CC target cell involves interactions with glycosaminoglycans and CC membrane proteins such as bovine CD46/MCP and low-density- CC lipoprotein receptor. CC -!- FUNCTION: P7 forms a leader sequence to properly orient NS2 in the CC membrane (By similarity). CC -!- FUNCTION: Uncleaved NS2-3 is required for production of infectious CC virus. CC -!- FUNCTION: NS2 protease seems to play a vital role in viral RNA CC replication control and in the pathogenicity of the virus. CC -!- FUNCTION: NS3 displays three enzymatic activities: serine CC protease, NTPase and RNA helicase. CC -!- FUNCTION: NS4A is a cofactor for the NS3 protease activity (By CC similarity). CC -!- FUNCTION: RNA-directed RNA polymerase NS5B replicates the viral (+) CC and (-) genome. CC -!- CATALYTIC ACTIVITY: Leu is conserved at position P1 for all four CC cleavage sites. Alanine is found at position P1' of the NS4A-NS4B CC cleavage site, whereas serine is found at position P1' of the NS3- CC NS4A, NS4B-NS5A and NS5A-NS5B cleavage sites. CC -!- CATALYTIC ACTIVITY: Nucleoside triphosphate + RNA(n) = diphosphate CC + RNA(n+1). CC -!- CATALYTIC ACTIVITY: NTP + H(2)O = NDP + phosphate. CC -!- SUBUNIT: The E(rns) glycoprotein is found as a homodimer; CC disulfide-linked (By similarity). The E1 and E2 envelope CC glycoproteins form disulfide-linked homodimers as well as CC heterodimers (By similarity). CC -!- SUBCELLULAR LOCATION: E(rns) glycoprotein: Host membrane; CC Peripheral membrane protein.
Note=The C-terminal membrane anchor CC of E(rns) is an amphipathic helix embedded in-plane in the CC membrane (By similarity). CC -!- SUBCELLULAR LOCATION: Envelope glycoprotein E2: Host cell surface. CC -!- SUBCELLULAR LOCATION: Cysteine protease NS2: Host membrane; Multi- CC pass membrane protein (Potential). CC -!- PTM: The E(rns) glycoprotein is heavily glycosylated (By CC similarity). CC -!- PTM: The viral RNA of pestiviruses is expressed as a single CC polyprotein which undergoes post-translational proteolytic CC processing resulting in the production of at least eleven CC individual proteins. The N-terminal protease cleaves itself from CC the nascent polyprotein autocatalytically and thereby generates CC the N-terminus of the adjacent viral capsid protein C (By CC similarity). CC -!- PTM: Cleavage between E2 and p7 is partial (By similarity). CC -!- PTM: Cleavage between NS2 and NS3 is partial. CC -!- MISCELLANEOUS: BVDV is divided into two types: cytopathic and non- CC cytopathic. Both types of viruses can be found in animals CC suffering from mucosal disease, as a cytopathic BVDV can develop CC from a non-cytopathic virus within the infected animal by CC deletions, mutations or insertions. Both types express uncleaved CC NS2-3, but cytopathic strains also express NS3. The cytopathic CC NADL strain contains an insertion (Jiv 90) that potentiates the CC partial cleavage of NS2-3. Removal of this insertion in the NADL CC Jiv 90(-) strain results in a non-cytopathic strain in which NS2-3 CC remains uncleaved. CC -!- SIMILARITY: Belongs to the pestivirus polyprotein family. CC -!- SIMILARITY: Contains 1 helicase ATP-binding domain. CC -!- SIMILARITY: Contains 1 helicase C-terminal domain. CC -!- SIMILARITY: Contains 1 peptidase C53 domain. CC -!- SIMILARITY: Contains 1 peptidase C74 domain. CC -!- SIMILARITY: Contains 1 peptidase S31 domain. CC -!- SIMILARITY: Contains 1 RdRp catalytic domain.
CC --------------------------------------------------------------------------- CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms CC Distributed under the Creative Commons Attribution (CC BY 4.0) License CC --------------------------------------------------------------------------- DR EMBL; M31182; AAA42854.1; -; Genomic_RNA. DR PIR; A29198; GNWVBV. DR RefSeq; NP_040937.1; -. DR PDB; 1S48; X-ray; 3.00 A; A=3340-3948. DR PDB; 1S49; X-ray; 3.00 A; A=3340-3948. DR PDB; 1S4F; X-ray; 3.00 A; A/B/C/D=3348-3948. DR PDBsum; 1S48; -. DR PDBsum; 1S49; -. DR PDBsum; 1S4F; -. DR SMR; P19711; 1901-2301, 2774-2801. DR MEROPS; C53.001; -. DR MEROPS; C74.001; -. DR GeneID; 1489735; -. DR GO; GO:0019898; C:extrinsic to membrane; IEA:UniProtKB-SubCell. DR GO; GO:0016021; C:integral to membrane; IEA:UniProtKB-SubCell. DR GO; GO:0005524; F:ATP binding; IEA:UniProtKB-KW. DR GO; GO:0008026; F:ATP-dependent helicase activity; IEA:InterPro. DR GO; GO:0008234; F:cysteine-type peptidase activity; IEA:UniProtKB-KW. DR GO; GO:0033897; F:ribonuclease T2 activity; IEA:InterPro. DR GO; GO:0003723; F:RNA binding; IEA:InterPro. DR GO; GO:0003968; F:RNA-directed RNA polymerase activity; IEA:UniProtKB-KW. DR GO; GO:0004252; F:serine-type endopeptidase activity; IEA:InterPro. DR GO; GO:0006508; P:proteolysis; IEA:InterPro. DR GO; GO:0006410; P:transcription, RNA-dependent; IEA:UniProtKB-KW. DR GO; GO:0019079; P:viral genome replication; IEA:InterPro. DR GO; GO:0019082; P:viral protein processing; IEA:InterPro. DR InterPro; IPR014001; DEAD-like_N. DR InterPro; IPR001650; DNA/RNA_helicase_C. DR InterPro; IPR014021; Helicase_SF1/SF2_ATP-bd. DR InterPro; IPR015609; Hsp40/DnaJ_Rel. DR InterPro; IPR008751; Peptidase_C53. DR InterPro; IPR000280; Peptidase_S31. DR InterPro; IPR007094; RNA-dir_pol_PSvirus. DR InterPro; IPR002166; RNA_pol_HCV. DR InterPro; IPR001568; RNase_T2. DR InterPro; IPR018188; RNase_T2_AS. DR InterPro; IPR009003; Ser/Cys_Pept_Trypsin-like. 
DR PANTHER; PTHR11821; Hsp40/DnaJ_Rel; 1. DR Pfam; PF00271; Helicase_C; 1. DR Pfam; PF05550; Peptidase_C53; 1. DR Pfam; PF05578; Peptidase_S31; 1. DR Pfam; PF00998; RdRP_3; 1. DR PRINTS; PR00729; CDVENDOPTASE. DR ProDom; PD003091; Peptidase_C53; 1. DR SMART; SM00487; DEXDc; 1. DR SMART; SM00490; HELICc; 1. DR PROSITE; PS51192; HELICASE_ATP_BIND_1; 1. DR PROSITE; PS51194; HELICASE_CTER; 1. DR PROSITE; PS50507; RDRP_SSRNA_POS; 1. DR PROSITE; PS00531; RNASE_T2_2; 1. PE 1: Evidence at protein level; KW 3D-structure; ATP-binding; Complete proteome; KW Direct protein sequencing; Disulfide bond; Glycoprotein; Helicase; KW Host membrane; Hydrolase; Membrane; Nucleotide-binding; KW Nucleotidyltransferase; Protease; RNA replication; KW RNA-directed RNA polymerase; Serine protease; Thiol protease; KW Transferase; Transmembrane. FT CHAIN 1 168 N-terminal protease (By similarity). FT /FTId=PRO_0000038024. FT CHAIN 169 270 Capsid protein C (By similarity). FT /FTId=PRO_0000038025. FT CHAIN 271 497 E(rns) glycoprotein (By similarity). FT /FTId=PRO_0000038026. FT CHAIN 498 659 Envelope glycoprotein E1 (By similarity). FT /FTId=PRO_0000038027. FT CHAIN 660 1066 Envelope glycoprotein E2 (By similarity). FT /FTId=PRO_0000038028. FT CHAIN 1067 1136 p7 (By similarity). FT /FTId=PRO_0000038029. FT CHAIN 1137 2362 Non-structural protein 2-3. FT /FTId=PRO_0000038030. FT CHAIN 1137 1679 Cysteine protease NS2. FT /FTId=PRO_0000038031. FT CHAIN 1680 2362 Serine protease/NTPase/helicase NS3. FT /FTId=PRO_0000038032. FT CHAIN 2363 2426 Non-structural protein 4A (By FT similarity). FT /FTId=PRO_0000038033. FT CHAIN 2427 2773 Non-structural protein 4B (By FT similarity). FT /FTId=PRO_0000038034. FT CHAIN 2774 3269 Non-structural protein 5A (By FT similarity). FT /FTId=PRO_0000038035. FT CHAIN 3270 3988 RNA-directed RNA polymerase (By FT similarity). FT /FTId=PRO_0000038036. FT TRANSMEM 1144 1164 Potential. FT TRANSMEM 1189 1209 Potential. FT TRANSMEM 1217 1237 Potential. 
FT TRANSMEM 1247 1267 Potential. FT TRANSMEM 1281 1301 Potential. FT TRANSMEM 1360 1380 Potential. FT TRANSMEM 1658 1678 Potential. FT DOMAIN 1 168 Peptidase C53. FT DOMAIN 1441 1679 Peptidase C74. FT DOMAIN 1892 2050 Helicase ATP-binding. FT DOMAIN 2068 2233 Helicase C-terminal. FT DOMAIN 3608 3731 RdRp catalytic. FT ACT_SITE 22 22 For N-terminal protease activity (By FT similarity). FT ACT_SITE 49 49 For N-terminal protease activity (By FT similarity). FT ACT_SITE 69 69 For N-terminal protease activity (By FT similarity). FT ACT_SITE 1447 1447 For cysteine protease NS2 activity (By FT similarity). FT ACT_SITE 1461 1461 For cysteine protease NS2 activity (By FT similarity). FT ACT_SITE 1512 1512 For cysteine protease NS2 activity (By FT similarity). FT ACT_SITE 1748 1748 Charge relay system; for serine protease FT NS3 activity (By similarity). FT ACT_SITE 1785 1785 Charge relay system; for serine protease FT NS3 activity (By similarity). FT ACT_SITE 1842 1842 Charge relay system; for serine protease FT NS3 activity (By similarity). FT SITE 168 169 Cleavage; by autolysis (By similarity). FT SITE 270 271 Cleavage; by host signal peptidase (By FT similarity). FT SITE 497 498 Cleavage. FT SITE 659 660 Cleavage; by host signal peptidase (By FT similarity). FT SITE 1066 1067 Cleavage; by host signal peptidase; FT partial (By similarity). FT SITE 1136 1137 Cleavage; by host signal peptidase (By FT similarity). FT SITE 1679 1680 Cleavage; partial; cysteine protease NS2. FT SITE 2362 2363 Cleavage; by serine protease NS3 (By FT similarity). FT SITE 2426 2427 Cleavage; by serine protease NS3 (By FT similarity). FT SITE 2773 2774 Cleavage; by serine protease NS3 (By FT similarity). FT SITE 3269 3270 Cleavage; by serine protease NS3 (By FT similarity). FT CARBOHYD 272 272 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 281 281 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 296 296 N-linked (GlcNAc...); by host FT (Potential). 
FT CARBOHYD 335 335 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 365 365 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 370 370 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 413 413 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 487 487 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 597 597 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 809 809 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 878 878 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 922 922 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 990 990 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 1357 1357 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 1419 1419 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 1451 1451 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 1803 1803 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 2224 2224 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 2307 2307 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 2584 2584 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 2772 2772 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 2981 2981 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 3778 3778 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 3867 3867 N-linked (GlcNAc...); by host FT (Potential). FT CARBOHYD 3883 3883 N-linked (GlcNAc...); by host FT (Potential). 
FT HELIX 3364 3372 FT STRAND 3373 3376 FT STRAND 3382 3384 FT STRAND 3388 3392 FT STRAND 3402 3404 FT HELIX 3406 3414 FT HELIX 3419 3421 FT STRAND 3422 3427 FT HELIX 3431 3440 FT HELIX 3453 3462 FT HELIX 3467 3469 FT HELIX 3478 3481 FT TURN 3482 3484 FT HELIX 3501 3504 FT HELIX 3507 3518 FT STRAND 3527 3531 FT STRAND 3535 3537 FT HELIX 3539 3543 FT STRAND 3549 3551 FT STRAND 3555 3558 FT HELIX 3563 3570 FT HELIX 3572 3575 FT HELIX 3586 3588 FT HELIX 3591 3593 FT HELIX 3594 3603 FT STRAND 3605 3612 FT HELIX 3618 3621 FT HELIX 3624 3637 FT HELIX 3640 3642 FT HELIX 3643 3652 FT STRAND 3655 3660 FT STRAND 3663 3668 FT HELIX 3679 3699 FT HELIX 3703 3705 FT HELIX 3706 3709 FT STRAND 3710 3715 FT STRAND 3718 3724 FT HELIX 3725 3741 FT STRAND 3756 3759 FT HELIX 3760 3762 FT STRAND 3768 3775 FT STRAND 3780 3785 FT HELIX 3788 3797 FT STRAND 3802 3804 FT STRAND 3808 3810 FT HELIX 3811 3822 FT HELIX 3826 3837 FT STRAND 3846 3854 FT HELIX 3856 3864 FT HELIX 3868 3870 FT STRAND 3871 3874 FT HELIX 3876 3882 FT HELIX 3885 3888 FT HELIX 3896 3908 FT STRAND 3910 3912 FT TURN 3915 3918 FT HELIX 3920 3926 FT STRAND 3935 3939 SQ SEQUENCE 3988 AA; 449163 MW; 4474212F338661B8 CRC64; MELITNELLY KTYKQKPVGV EEPVYDQAGD PLFGERGAVH PQSTLKLPHK RGERDVPTNL ASLPKRGDCR SGNSRGPVSG IYLKPGPLFY QDYKGPVYHR APLELFEEGS MCETTKRIGR VTGSDGKLYH IYVCIDGCII IKSATRSYQR VFRWVHNRLD CPLWVTTCSD TKEEGATKKK TQKPDRLERG KMKIVPKESE KDSKTKPPDA TIVVEGVKYQ VRKKGKTKSK NTQDGLYHNK NKPQESRKKL EKALLAWAII AIVLFQVTMG ENITQWNLQD NGTEGIQRAM FQRGVNRSLH GIWPEKICTG VPSHLATDIE LKTIHGMMDA SEKTNYTCCR LQRHEWNKHG WCNWYNIEPW ILVMNRTQAN LTEGQPPREC AVTCRYDRAS DLNVVTQARD SPTPLTGCKK GKNFSFAGIL MRGPCNFEIA ASDVLFKEHE RISMFQDTTL YLVDGLTNSL EGARQGTAKL TTWLGKQLGI LGKKLENKSK TWFGAYAASP YCDVDRKIGY IWYTKNCTPA CLPKNTKIVG PGKFGTNAED GKILHEMGGH LSEVLLLSLV VLSDFAPETA SVMYLILHFS IPQSHVDVMD CDKTQLNLTV ELTTAEVIPG SVWNLGKYVC IRPNWWPYET TVVLAFEEVS QVVKLVLRAL RDLTRIWNAA TTTAFLVCLV KIVRGQMVQG ILWLLLITGV QGHLDCKPEF SYAIAKDERI GQLGAEGLTT TWKEYSPGMK LEDTMVIAWC 
EDGKLMYLQR CTRETRYLAI LHTRALPTSV VFKKLFDGRK QEDVVEMNDN FEFGLCPCDA KPIVRGKFNT TLLNGPAFQM VCPIGWTGTV SCTSFNMDTL ATTVVRTYRR SKPFPHRQGC ITQKNLGEDL HNCILGGNWT CVPGDQLLYK GGSIESCKWC GYQFKESEGL PHYPIGKCKL ENETGYRLVD STSCNREGVA IVPQGTLKCK IGKTTVQVIA MDTKLGPMPC RPYEIISSEG PVEKTACTFN YTKTLKNKYF EPRDSYFQQY MLKGEYQYWF DLEVTDHHRD YFAESILVVV VALLGGRYVL WLLVTYMVLS EQKALGIQYG SGEVVMMGNL LTHNNIEVVT YFLLLYLLLR EESVKKWVLL LYHILVVHPI KSVIVILLMI GDVVKADSGG QEYLGKIDLC FTTVVLIVIG LIIARRDPTI VPLVTIMAAL RVTELTHQPG VDIAVAVMTI TLLMVSYVTD YFRYKKWLQC ILSLVSAVFL IRSLIYLGRI EMPEVTIPNW RPLTLILLYL ISTTIVTRWK VDVAGLLLQC VPILLLVTTL WADFLTLILI LPTYELVKLY YLKTVRTDTE RSWLGGIDYT RVDSIYDVDE SGEGVYLFPS RQKAQGNFSI LLPLIKATLI SCVSSKWQLI YMSYLTLDFM YYMHRKVIEE ISGGTNIISR LVAALIELNW SMEEEESKGL KKFYLLSGRL RNLIIKHKVR NETVASWYGE EEVYGMPKIM TIIKASTLSK SRHCIICTVC EGREWKGGTC PKCGRHGKPI TCGMSLADFE ERHYKRIFIR EGNFEGMCSR CQGKHRRFEM DREPKSARYC AECNRLHPAE EGDFWAESSM LGLKITYFAL MDGKVYDITE WAGCQRVGIS PDTHRVPCHI SFGSRMPFRQ EYNGFVQYTA RGQLFLRNLP VLATKVKMLM VGNLGEEIGN LEHLGWILRG PAVCKKITEH EKCHINILDK LTAFFGIMPR GTTPRAPVRF PTSLLKVRRG LETAWAYTHQ GGISSVDHVT AGKDLLVCDS MGRTRVVCQS NNRLTDETEY GVKTDSGCPD GARCYVLNPE AVNISGSKGA VVHLQKTGGE FTCVTASGTP AFFDLKNLKG WSGLPIFEAS SGRVVGRVKV GKNEESKPTK IMSGIQTVSK NRADLTEMVK KITSMNRGDF KQITLATGAG KTTELPKAVI EEIGRHKRVL VLIPLRAAAE SVYQYMRLKH PSISFNLRIG DMKEGDMATG ITYASYGYFC QMPQPKLRAA MVEYSYIFLD EYHCATPEQL AIIGKIHRFS ESIRVVAMTA TPAGSVTTTG QKHPIEEFIA PEVMKGEDLG SQFLDIAGLK IPVDEMKGNM LVFVPTRNMA VEVAKKLKAK GYNSGYYYSG EDPANLRVVT SQSPYVIVAT NAIESGVTLP DLDTVIDTGL KCEKRVRVSS KIPFIVTGLK RMAVTVGEQA QRRGRVGRVK PGRYYRSQET ATGSKDYHYD LLQAQRYGIE DGINVTKSFR EMNYDWSLYE EDSLLITQLE ILNNLLISED LPAAVKNIMA RTDHPEPIQL AYNSYEVQVP VLFPKIRNGE VTDTYENYSF LNARKLGEDV PVYIYATEDE DLAVDLLGLD WPDPGNQQVV ETGKALKQVT GLSSAENALL VALFGYVGYQ ALSKRHVPMI TDIYTIEDQR LEDTTHLQYA PNAIKTDGTE TELKELASGD VEKIMGAISD YAAGGLEFVK SQAEKIKTAP LFKENAEAAK GYVQKFIDSL IENKEEIIRY GLWGTHTALY KSIAARLGHE TAFATLVLKW LAFGGESVSD HVKQAAVDLV VYYVMNKPSF 
PGDSETQQEG RRFVASLFIS ALATYTYKTW NYHNLSKVVE PALAYLPYAT SALKMFTPTR LESVVILSTT IYKTYLSIRK GKSDGLLGTG ISAAMEILSQ NPVSVGISVM LGVGAIAAHN AIESSEQKRT LLMKVFVKNF LDQAATDELV KENPEKIIMA LFEAVQTIGN PLRLIYHLYG VYYKGWEAKE LSERTAGRNL FTLIMFEAFE LLGMDSQGKI RNLSGNYILD LIYGLHKQIN RGLKKMVLGW APAPFSCDWT PSDERIRLPT DNYLRVETRC PCGYEMKAFK NVGGKLTKVE ESGPFLCRNR PGRGPVNYRV TKYYDDNLRE IKPVAKLEGQ VEHYYKGVTA KIDYSKGKML LATDKWEVEH GVITRLAKRY TGVGFNGAYL GDEPNHRALV ERDCATITKN TVQFLKMKKG CAFTYDLTIS NLTRLIELVH RNNLEEKEIP TATVTTWLAY TFVNEDVGTI KPVLGERVIP DPVVDINLQP EVQVDTSEVG ITIIGRETLM TTGVTPVLEK VEPDASDNQN SVKIGLDEGN YPGPGIQTHT LTEEIHNRDA RPFIMILGSR NSISNRAKTA RNINLYTGND PREIRDLMAA GRMLVVALRD VDPELSEMVD FKGTFLDREA LEALSLGQPK PKQVTKEAVR NLIEQKKDVE IPNWFASDDP VFLEVALKND KYYLVGDVGE LKDQAKALGA TDQTRIIKEV GSRTYAMKLS SWFLKASNKQ MSLTPLFEEL LLRCPPATKS NKGHMASAYQ LAQGNWEPLG CGVHLGTIPA RRVKIHPYEA YLKLKDFIEE EEKKPRVKDT VIREHNKWIL KKIRFQGNLN TKKMLNPGKL SEQLDREGRK RNIYNHQIGT IMSSAGIRLE KLPIVRAQTD TKTFHEAIRD KIDKSENRQN PELHNKLLEI FHTIAQPTLK HTYGEVTWEQ LEAGVNRKGA AGFLEKKNIG EVLDSEKHLV EQLVRDLKAG RKIKYYETAI PKNEKRDVSD DWQAGDLVVE KRPRVIQYPE AKTRLAITKV MYNWVKQQPV VIPGYEGKTP LFNIFDKVRK EWDSFNEPVA VSFDTKAWDT QVTSKDLQLI GEIQKYYYKK EWHKFIDTIT DHMTEVPVIT ADGEVYIRNG QRGSGQPDTS AGNSMLNVLT MMYGFCESTG VPYKSFNRVA RIHVCGDDGF LITEKGLGLK FANKGMQILH EAGKPQKITE GEKMKVAYRF EDIEFCSHTP VPVRWSDNTS SHMAGRDTAV ILSKMATRLD SSGERGTTAY EKAVAFSFLL MYSWNPLVRR ICLLVLSQQP ETDPSKHATY YYKGDPIGAY KDVIGRNLSE LKRTGFEKLA NLNLSLSTLG VWTKHTSKRI IQDCVAIGKE EGNWLVKPDR LISSKTGHLY IPDKGFTLQG KHYEQLQLRT ETNPVMGVGT ERYKLGPIVN LLLRRLKILL MTAVGVSS //