ID SET1A_HUMAN Reviewed; 1707 AA. AC O15047; A6NP62; Q6PIF3; Q8TAJ6; DT 21-JUN-2005, integrated into UniProtKB/Swiss-Prot. DT 21-JUN-2005, sequence version 3. DT 19-MAR-2014, entry version 121. DE RecName: Full=Histone-lysine N-methyltransferase SETD1A; DE EC=2.1.1.43; DE AltName: Full=Lysine N-methyltransferase 2F; DE AltName: Full=SET domain-containing protein 1A; DE Short=hSET1A; DE AltName: Full=Set1/Ash2 histone methyltransferase complex subunit SET1; GN Name=SETD1A; Synonyms=KIAA0339, KMT2F, SET1, SET1A; OS Homo sapiens (Human). OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; OC Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; OC Catarrhini; Hominidae; Homo. OX NCBI_TaxID=9606; RN [1] RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA]. RC TISSUE=Brain; RX PubMed=9205841; DOI=10.1093/dnares/4.2.141; RA Nagase T., Ishikawa K., Nakajima D., Ohira M., Seki N., Miyajima N., RA Tanaka A., Kotani H., Nomura N., Ohara O.; RT "Prediction of the coding sequences of unidentified human genes. VII. RT The complete sequences of 100 new cDNA clones from brain which can RT code for large proteins in vitro."; RL DNA Res. 4:141-150(1997). RN [2] RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]. 
RX PubMed=15616553; DOI=10.1038/nature03187; RA Martin J., Han C., Gordon L.A., Terry A., Prabhakar S., She X., RA Xie G., Hellsten U., Chan Y.M., Altherr M., Couronne O., Aerts A., RA Bajorek E., Black S., Blumer H., Branscomb E., Brown N.C., Bruno W.J., RA Buckingham J.M., Callen D.F., Campbell C.S., Campbell M.L., RA Campbell E.W., Caoile C., Challacombe J.F., Chasteen L.A., RA Chertkov O., Chi H.C., Christensen M., Clark L.M., Cohn J.D., RA Denys M., Detter J.C., Dickson M., Dimitrijevic-Bussod M., Escobar J., RA Fawcett J.J., Flowers D., Fotopulos D., Glavina T., Gomez M., RA Gonzales E., Goodstein D., Goodwin L.A., Grady D.L., Grigoriev I., RA Groza M., Hammon N., Hawkins T., Haydu L., Hildebrand C.E., Huang W., RA Israni S., Jett J., Jewett P.B., Kadner K., Kimball H., Kobayashi A., RA Krawczyk M.-C., Leyba T., Longmire J.L., Lopez F., Lou Y., Lowry S., RA Ludeman T., Manohar C.F., Mark G.A., McMurray K.L., Meincke L.J., RA Morgan J., Moyzis R.K., Mundt M.O., Munk A.C., Nandkeshwar R.D., RA Pitluck S., Pollard M., Predki P., Parson-Quintana B., Ramirez L., RA Rash S., Retterer J., Ricke D.O., Robinson D.L., Rodriguez A., RA Salamov A., Saunders E.H., Scott D., Shough T., Stallings R.L., RA Stalvey M., Sutherland R.D., Tapia R., Tesmer J.G., Thayer N., RA Thompson L.S., Tice H., Torney D.C., Tran-Gyamfi M., Tsai M., RA Ulanovsky L.E., Ustaszewska A., Vo N., White P.S., Williams A.L., RA Wills P.L., Wu J.-R., Wu K., Yang J., DeJong P., Bruce D., RA Doggett N.A., Deaven L., Schmutz J., Grimwood J., Richardson P., RA Rokhsar D.S., Eichler E.E., Gilna P., Lucas S.M., Myers R.M., RA Rubin E.M., Pennacchio L.A.; RT "The sequence and analysis of duplication-rich human chromosome 16."; RL Nature 432:988-994(2004). RN [3] RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] OF 1-248 AND 1239-1707. 
RC TISSUE=Brain, and Duodenum; RX PubMed=15489334; DOI=10.1101/gr.2596504; RG The MGC Project Team; RT "The status, quality, and expansion of the NIH full-length cDNA RT project: the Mammalian Gene Collection (MGC)."; RL Genome Res. 14:2121-2127(2004). RN [4] RP FUNCTION, AND INTERACTION WITH HCFC1. RX PubMed=12670868; DOI=10.1101/gad.252103; RA Wysocka J., Myers M.P., Laherty C.D., Eisenman R.N., Herr W.; RT "Human Sin3 deacetylase and trithorax-related Set1/Ash2 histone H3-K4 RT methyltransferase are tethered together selectively by the cell- RT proliferation factor HCF-1."; RL Genes Dev. 17:896-911(2003). RN [5] RP IDENTIFICATION IN THE SET1 COMPLEX. RX PubMed=16253997; DOI=10.1074/jbc.M508312200; RA Lee J.-H., Skalnik D.G.; RT "CpG-binding protein (CXXC finger protein 1) is a component of the RT mammalian Set1 histone H3-Lys4 methyltransferase complex, the analogue RT of the yeast Set1/COMPASS complex."; RL J. Biol. Chem. 280:41725-41731(2005). RN [6] RP IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]. RC TISSUE=Cervix carcinoma; RX PubMed=17081983; DOI=10.1016/j.cell.2006.09.026; RA Olsen J.V., Blagoev B., Gnad F., Macek B., Kumar C., Mortensen P., RA Mann M.; RT "Global, in vivo, and site-specific phosphorylation dynamics in RT signaling networks."; RL Cell 127:635-648(2006). RN [7] RP SUBCELLULAR LOCATION, AND IDENTIFICATION IN THE SET1 COMPLEX. RX PubMed=17355966; DOI=10.1074/jbc.M609809200; RA Lee J.-H., Tate C.M., You J.-S., Skalnik D.G.; RT "Identification and characterization of the human Set1B histone H3- RT Lys4 methyltransferase complex."; RL J. Biol. Chem. 282:13419-13428(2007). RN [8] RP IDENTIFICATION IN SET1 COMPLEX, AND INTERACTION WITH ASH2L; RBBP5; RP CXXC1; HCFC1; WDR5; WDR82 AND POLR2A. 
RX PubMed=17998332; DOI=10.1128/MCB.01356-07; RA Lee J.H., Skalnik D.G.; RT "Wdr82 is a C-terminal domain-binding protein that recruits the Setd1A RT Histone H3-Lys4 methyltransferase complex to transcription start sites RT of transcribed human genes."; RL Mol. Cell. Biol. 28:609-618(2008). RN [9] RP IDENTIFICATION IN SET1 COMPLEX. RX PubMed=18838538; DOI=10.1128/MCB.00976-08; RA Wu M., Wang P.F., Lee J.S., Martin-Brown S., Florens L., Washburn M., RA Shilatifard A.; RT "Molecular regulation of H3K4 trimethylation by Wdr82, a component of RT human Set1/COMPASS."; RL Mol. Cell. Biol. 28:7337-7344(2008). RN [10] RP IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]. RC TISSUE=Cervix carcinoma; RX PubMed=18669648; DOI=10.1073/pnas.0805139105; RA Dephoure N., Zhou C., Villen J., Beausoleil S.A., Bakalarski C.E., RA Elledge S.J., Gygi S.P.; RT "A quantitative atlas of mitotic phosphorylation."; RL Proc. Natl. Acad. Sci. U.S.A. 105:10762-10767(2008). RN [11] RP IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]. RX PubMed=19413330; DOI=10.1021/ac9004309; RA Gauci S., Helbig A.O., Slijper M., Krijgsveld J., Heck A.J., RA Mohammed S.; RT "Lys-N and trypsin cover complementary parts of the phosphoproteome in RT a refined SCX-based approach."; RL Anal. Chem. 81:4493-4501(2009). RN [12] RP PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-508, AND IDENTIFICATION RP BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]. RC TISSUE=Leukemic T-cell; RX PubMed=19690332; DOI=10.1126/scisignal.2000007; RA Mayya V., Lundgren D.H., Hwang S.-I., Rezaul K., Wu L., Eng J.K., RA Rodionov V., Han D.K.; RT "Quantitative phosphoproteomic analysis of T cell receptor signaling RT reveals system-wide modulation of protein-protein interactions."; RL Sci. Signal. 2:RA46-RA46(2009). RN [13] RP IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]. 
RX PubMed=21269460; DOI=10.1186/1752-0509-5-17; RA Burkard T.R., Planyavsky M., Kaupe I., Breitwieser F.P., RA Buerckstuemmer T., Bennett K.L., Superti-Furga G., Colinge J.; RT "Initial characterization of the human central proteome."; RL BMC Syst. Biol. 5:17-17(2011). RN [14] RP IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]. RX PubMed=21406692; DOI=10.1126/scisignal.2001570; RA Rigbolt K.T., Prokhorova T.A., Akimov V., Henningsen J., RA Johansen P.T., Kratchmarova I., Kassem M., Mann M., Olsen J.V., RA Blagoev B.; RT "System-wide temporal characterization of the proteome and RT phosphoproteome of human embryonic stem cell differentiation."; RL Sci. Signal. 4:RS3-RS3(2011). RN [15] RP INTERACTION WITH ZNF335. RX PubMed=23178126; DOI=10.1016/j.cell.2012.10.043; RA Yang Y.J., Baltus A.E., Mathew R.S., Murphy E.A., Evrony G.D., RA Gonzalez D.M., Wang E.P., Marshall-Walker C.A., Barry B.J., Murn J., RA Tatarakis A., Mahajan M.A., Samuels H.H., Shi Y., Golden J.A., RA Mahajnah M., Shenhav R., Walsh C.A.; RT "Microcephaly gene links trithorax and REST/NRSF to control neural RT stem cell proliferation and differentiation."; RL Cell 151:1097-1112(2012). RN [16] RP INTERACTION WITH SUPT6H. RX PubMed=22843687; DOI=10.1074/jbc.M112.351569; RA Begum N.A., Stanlie A., Nakata M., Akiyama H., Honjo T.; RT "The histone chaperone Spt6 is required for activation-induced RT cytidine deaminase target determination through H3K4me3 regulation."; RL J. Biol. Chem. 287:32415-32429(2012). CC -!- FUNCTION: Histone methyltransferase that specifically methylates CC 'Lys-4' of histone H3, when part of the SET1 histone CC methyltransferase (HMT) complex, but not if the neighboring 'Lys- CC 9' residue is already methylated. H3 'Lys-4' methylation CC represents a specific tag for epigenetic transcriptional CC activation. 
The non-overlapping localization with SETD1B suggests CC that SETD1A and SETD1B make non-redundant contributions to the CC epigenetic control of chromatin structure and gene expression. CC -!- CATALYTIC ACTIVITY: S-adenosyl-L-methionine + L-lysine-[histone] = CC S-adenosyl-L-homocysteine + N(6)-methyl-L-lysine-[histone]. CC -!- SUBUNIT: Component of the SET1 complex, at least composed of the CC catalytic subunit (SETD1A or SETD1B), WDR5, WDR82, RBBP5, CC ASH2L/ASH2, CXXC1/CFP1, HCFC1 and DPY30. Interacts with HCFC1. CC Interacts with ASH2/ASH2L, CXXC1/CFP1, WDR5 and RBBP5. Interacts CC (via the RRM domain) with WDR82. Interacts (via the RRM domain) CC with hyperphosphorylated C-terminal domain (CTD) of RNA polymerase CC II large subunit (POLR2A) only in the presence of WDR82. Binds CC specifically to CTD heptad repeats phosphorylated on 'Ser-5' of CC each heptad. Interacts with ZNF335. Interacts with SUPT6H. CC -!- INTERACTION: CC P51610:HCFC1; NbExp=2; IntAct=EBI-540779, EBI-396176; CC -!- SUBCELLULAR LOCATION: Nucleus speckle. Chromosome. Note=Localizes CC to a largely non-overlapping set of euchromatic nuclear speckles CC with SETD1B, suggesting that SETD1A and SETD1B each bind to a CC unique set of target genes. CC -!- SIMILARITY: Belongs to the class V-like SAM-binding CC methyltransferase superfamily. CC -!- SIMILARITY: Contains 1 post-SET domain. CC -!- SIMILARITY: Contains 1 RRM (RNA recognition motif) domain. CC -!- SIMILARITY: Contains 1 SET domain. CC -!- SEQUENCE CAUTION: CC Sequence=AAH35795.1; Type=Miscellaneous discrepancy; Note=Contaminating sequence. 
Potential poly-A sequence; CC Sequence=BAA20797.2; Type=Erroneous initiation; Note=Translation N-terminally shortened; CC --------------------------------------------------------------------------- CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms CC Distributed under the Creative Commons Attribution (CC BY 4.0) License CC --------------------------------------------------------------------------- DR EMBL; AB002337; BAA20797.2; ALT_INIT; mRNA. DR EMBL; AC135048; -; NOT_ANNOTATED_CDS; Genomic_DNA. DR EMBL; BC027450; AAH27450.1; -; mRNA. DR EMBL; BC035795; AAH35795.1; ALT_SEQ; mRNA. DR RefSeq; NP_055527.1; NM_014712.1. DR RefSeq; XP_005255780.1; XM_005255723.1. DR UniGene; Hs.297483; -. DR PDB; 3S8S; X-ray; 1.30 A; A=89-197. DR PDB; 3UVN; X-ray; 1.79 A; B/D=1492-1502. DR PDB; 4EWR; X-ray; 1.50 A; C=1488-1501. DR PDBsum; 3S8S; -. DR PDBsum; 3UVN; -. DR PDBsum; 4EWR; -. DR ProteinModelPortal; O15047; -. DR SMR; O15047; 89-195. DR BioGrid; 115088; 18. DR DIP; DIP-33494N; -. DR IntAct; O15047; 5. DR STRING; 9606.ENSP00000262519; -. DR PhosphoSite; O15047; -. DR PaxDb; O15047; -. DR PRIDE; O15047; -. DR DNASU; 9739; -. DR Ensembl; ENST00000262519; ENSP00000262519; ENSG00000099381. DR GeneID; 9739; -. DR KEGG; hsa:9739; -. DR UCSC; uc002ead.1; human. DR CTD; 9739; -. DR GeneCards; GC16P030968; -. DR HGNC; HGNC:29010; SETD1A. DR HPA; HPA020646; -. DR MIM; 611052; gene. DR neXtProt; NX_O15047; -. DR PharmGKB; PA128394556; -. DR eggNOG; COG2940; -. DR HOGENOM; HOG000154291; -. DR HOVERGEN; HBG067119; -. DR InParanoid; O15047; -. DR KO; K11422; -. DR OMA; GYLRLTY; -. DR OrthoDB; EOG7GQXTT; -. DR TreeFam; TF106436; -. DR BRENDA; 2.1.1.43; 2681. DR ChiTaRS; SETD1A; human. DR GenomeRNAi; 9739; -. DR NextBio; 36651; -. DR PRO; PR:O15047; -. DR ArrayExpress; O15047; -. DR Bgee; O15047; -. DR CleanEx; HS_SETD1A; -. DR Genevestigator; O15047; -. DR GO; GO:0005694; C:chromosome; IEA:UniProtKB-SubCell. DR GO; GO:0016607; C:nuclear speck; IEA:UniProtKB-SubCell. 
DR GO; GO:0048188; C:Set1C/COMPASS complex; IDA:UniProtKB. DR GO; GO:0042800; F:histone methyltransferase activity (H3-K4 specific); IDA:UniProtKB. DR GO; GO:0000166; F:nucleotide binding; IEA:InterPro. DR GO; GO:0003723; F:RNA binding; IEA:UniProtKB-KW. DR GO; GO:0006355; P:regulation of transcription, DNA-templated; IEA:UniProtKB-KW. DR GO; GO:0006351; P:transcription, DNA-templated; IEA:UniProtKB-KW. DR Gene3D; 3.30.70.330; -; 1. DR InterPro; IPR024657; COMPASS_Set1_N-SET. DR InterPro; IPR015722; Histone-lysine_MeTfrase. DR InterPro; IPR012677; Nucleotide-bd_a/b_plait. DR InterPro; IPR003616; Post-SET_dom. DR InterPro; IPR000504; RRM_dom. DR InterPro; IPR001214; SET_dom. DR PANTHER; PTHR22884:SF10; PTHR22884:SF10; 1. DR Pfam; PF11764; N-SET; 1. DR Pfam; PF00076; RRM_1; 1. DR Pfam; PF00856; SET; 1. DR SMART; SM00508; PostSET; 1. DR SMART; SM00360; RRM; 1. DR SMART; SM00317; SET; 1. DR PROSITE; PS50868; POST_SET; 1. DR PROSITE; PS50102; RRM; 1. DR PROSITE; PS50280; SET; 1. PE 1: Evidence at protein level; KW 3D-structure; Activator; Chromatin regulator; Chromosome; KW Complete proteome; Methyltransferase; Nucleus; Phosphoprotein; KW Polymorphism; Reference proteome; RNA-binding; KW S-adenosyl-L-methionine; Transcription; Transcription regulation; KW Transferase. FT CHAIN 1 1707 Histone-lysine N-methyltransferase FT SETD1A. FT /FTId=PRO_0000186056. FT DOMAIN 84 172 RRM. FT DOMAIN 1568 1685 SET. FT DOMAIN 1691 1707 Post-SET. FT REGION 1415 1450 Interaction with CFP1. FT REGION 1450 1537 Interaction with ASH2L, RBBP5 and WDR5. FT MOTIF 1299 1303 HCFC1-binding motif (HBM). FT COMPBIAS 244 362 Ser-rich. FT COMPBIAS 383 654 Pro-rich. FT COMPBIAS 899 1010 Glu-rich. FT COMPBIAS 1011 1062 Ser-rich. FT COMPBIAS 1071 1194 Pro-rich. FT COMPBIAS 1334 1375 Glu-rich. FT COMPBIAS 1403 1417 Pro-rich. FT MOD_RES 508 508 Phosphoserine. FT VARIANT 639 639 D -> N (in dbSNP:rs897985). FT /FTId=VAR_059318. FT CONFLICT 242 248 PCSQDTS -> ACPVTHV (in Ref. 3; AAH35795). 
FT CONFLICT 1240 1242 TEE -> FLG (in Ref. 3; AAH27450). FT STRAND 95 100 FT HELIX 107 114 FT TURN 115 117 FT STRAND 120 127 FT TURN 129 131 FT STRAND 134 144 FT HELIX 145 155 FT STRAND 166 169 FT HELIX 174 184 FT TURN 190 192 FT HELIX 1494 1497 SQ SEQUENCE 1707 AA; 186034 MW; 0084217B0D425050 CRC64; MDQEGGGDGQ KAPSFQWRNY KLIVDPALDP ALRRPSQKVY RYDGVHFSVN DSKYIPVEDL QDPRCHVRSK NRDFSLPVPK FKLDEFYIGQ IPLKEVTFAR LNDNVRETFL KDMCRKYGEV EEVEILLHPR TRKHLGLARV LFTSTRGAKE TVKNLHLTSV MGNIIHAQLD IKGQQRMKYY ELIVNGSYTP QTVPTGGKAL SEKFQGSGAA TETAESRRRS SSDTAAYPAG TTAVGTPGNG TPCSQDTSFS SSRQDTPSSF GQFTPQSSQG TPYTSRGSTP YSQDSAYSSS TTSTSFKPRR SENSYQDAFS RRHFSASSAS TTASTAIAAT TAATASSSAS SSSLSSSSSS SSSSSSSQFR SSDANYPAYY ESWNRYQRHT SYPPRRATRE EPPGAPFAEN TAERFPPSYT SYLPPEPSRP TDQDYRPPAS EAPPPEPPEP GGGGGGGGPS PEREEVRTSP RPASPARSGS PAPETTNESV PFAQHSSLDS RIEMLLKEQR SKFSFLASDT EEEEENSSMV LGARDTGSEV PSGSGHGPCT PPPAPANFED VAPTGSGEPG ATRESPKANG QNQASPCSSG DDMEISDDDR GGSPPPAPTP PQQPPPPPPP PPPPPPYLAS LPLGYPPHQP AYLLPPRPDG PPPPEYPPPP PPPPHIYDFV NSLELMDRLG AQWGGMPMSF QMQTQMLTRL HQLRQGKGLI AASAGPPGGA FGEAFLPFPP PQEAAYGLPY ALYAQGQEGR GAYSREAYHL PMPMAAEPLP SSSVSGEEAR LPPREEAELA EGKTLPTAGT VGRVLAMLVQ EMKSIMQRDL NRKMVENVAF GAFDQWWESK EEKAKPFQNA AKQQAKEEDK EKTKLKEPGL LSLVDWAKSG GTTGIEAFAF GSGLRGALRL PSFKVKRKEP SEISEASEEK RPRPSTPAEE DEDDPEQEKE AGEPGRPGTK PPKRDEERGK TQGKHRKSFA LDSEGEEASQ ESSSEKDEED DEEDEEDEDR EEAVDTTKKE TEVSDGEDEE SDSSSKCSLY ADSDGENDST SDSESSSSSS SSSSSSSSSS SSSSSSSSES SSEDEEEEER PAALPSASPP PREVPVPTPA PVEVPVPERV AGSPVTPLPE QEASPARPAG PTEESPPSAP LRPPEPPAGP PAPAPRPDER PSSPIPLLPP PKKRRKTVSF SAIEVVPAPE PPPATPPQAK FPGPASRKAP RGVERTIRNL PLDHASLVKS WPEEVSRGGR SRAGGRGRLT EEEEAEPGTE VDLAVLADLA LTPARRGLPA LPAVEDSEAT ETSDEAERPR PLLSHILLEH NYALAVKPTP PAPALRPPEP VPAPAALFSS PADEVLEAPE VVVAEAEEPK PQQLQQQREE GEEEGEEEGE EEEEESSDSS SSSDGEGALR RRSLRSHARR RRPPPPPPPP PPRAYEPRSE FEQMTILYDI WNSGLDSEDM SYLRLTYERL LQQTSGADWL NDTHWVHHTI TNLTTPKRKR RPQDGPREHQ TGSARSEGYY PISKKEKDKY LDVCPVSARQ LEGVDTQGTN RVLSERRSEQ 
RRLLSAIGTS AIMDSDLLKL NQLKFRKKKL RFGRSRIHEW GLFAMEPIAA DEMVIEYVGQ NIRQMVADMR EKRYVQEGIG SSYLFRVDHD TIIDATKCGN LARFINHCCT PNCYAKVITI ESQKKIVIYS KQPIGVDEEI TYDYKFPLED NKIPCLCGTE SCRGSLN //