ID   ZFHX2_HUMAN             Reviewed;        2572 AA.
AC   Q9C0A1; Q9UPU6;
DT   12-FEB-2003, integrated into UniProtKB/Swiss-Prot.
DT   13-JUL-2010, sequence version 3.
DT   05-OCT-2010, entry version 76.
DE   RecName: Full=Zinc finger homeobox protein 2;
DE   AltName: Full=Zinc finger homeodomain protein 2;
DE            Short=ZFH-2;
GN   Name=ZFHX2; Synonyms=KIAA1056, KIAA1762, ZNF409;
OS   Homo sapiens (Human).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
OC   Catarrhini; Hominidae; Homo.
OX   NCBI_TaxID=9606;
RN   [1]
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2), AND NUCLEOTIDE
RP   SEQUENCE [LARGE SCALE MRNA] OF 1113-2572 (ISOFORM 1).
RC   TISSUE=Brain;
RX   MEDLINE=21082932; PubMed=11214970; DOI=10.1093/dnares/7.6.347;
RA   Nagase T., Kikuno R., Hattori A., Kondo Y., Okumura K., Ohara O.;
RT   "Prediction of the coding sequences of unidentified human genes. XIX.
RT   The complete sequences of 100 new cDNA clones from brain which code
RT   for large proteins in vitro.";
RL   DNA Res. 7:347-355(2000).
RN   [2]
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RX   MEDLINE=22459283; PubMed=12508121; DOI=10.1038/nature01348;
RA   Heilig R., Eckenberg R., Petit J.-L., Fonknechten N., Da Silva C.,
RA   Cattolico L., Levy M., Barbe V., De Berardinis V., Ureta-Vidal A.,
RA   Pelletier E., Vico V., Anthouard V., Rowen L., Madan A., Qin S.,
RA   Sun H., Du H., Pepin K., Artiguenave F., Robert C., Cruaud C.,
RA   Bruels T., Jaillon O., Friedlander L., Samson G., Brottier P.,
RA   Cure S., Segurens B., Aniere F., Samain S., Crespeau H., Abbasi N.,
RA   Aiach N., Boscus D., Dickhoff R., Dors M., Dubois I., Friedman C.,
RA   Gouyvenoux M., James R., Madan A., Mairey-Estrada B., Mangenot S.,
RA   Martins N., Menard M., Oztas S., Ratcliffe A., Shaffer T., Trask B.,
RA   Vacherie B., Bellemere C., Belser C., Besnard-Gonnet M.,
RA   Bartol-Mavel D., Boutard M., Briez-Silla S., Combette S.,
RA   Dufosse-Laurent V., Ferron C., Lechaplais C., Louesse C., Muselet D.,
RA   Magdelenat G., Pateau E., Petit E., Sirvain-Trukniewicz P., Trybou A.,
RA   Vega-Czarny N., Bataille E., Bluet E., Bordelais I., Dubois M.,
RA   Dumont C., Guerin T., Haffray S., Hammadi R., Muanga J., Pellouin V.,
RA   Robert D., Wunderle E., Gauguet G., Roy A., Sainte-Marthe L.,
RA   Verdier J., Verdier-Discala C., Hillier L.W., Fulton L., McPherson J.,
RA   Matsuda F., Wilson R., Scarpelli C., Gyapay G., Wincker P., Saurin W.,
RA   Quetier F., Waterston R., Hood L., Weissenbach J.;
RT   "The DNA sequence and analysis of human chromosome 14.";
RL   Nature 421:601-607(2003).
CC   -!- FUNCTION: May be involved in transcriptional regulation.
CC   -!- SUBCELLULAR LOCATION: Nucleus (Probable).
CC   -!- ALTERNATIVE PRODUCTS:
CC       Event=Alternative splicing; Named isoforms=2;
CC       Name=1;
CC         IsoId=Q9C0A1-1; Sequence=Displayed;
CC       Name=2;
CC         IsoId=Q9C0A1-2; Sequence=VSP_039496, VSP_039497;
CC   -!- SIMILARITY: Contains 13 C2H2-type zinc fingers.
CC   -!- SIMILARITY: Contains 3 homeobox DNA-binding domains.
CC   -!- SEQUENCE CAUTION:
CC       Sequence=BAA83008.2; Type=Erroneous initiation;
CC       Note=Translation N-terminally shortened;
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; AB028979; BAA83008.2; ALT_INIT; mRNA.
DR   EMBL; AB051549; BAB21853.1; -; mRNA.
DR   EMBL; AL132855; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR   EMBL; AL135999; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR   IPI; IPI00007109; -.
DR   IPI; IPI00926149; -.
DR   UniGene; Hs.508937; -.
DR   UniGene; Hs.525247; -.
DR   ProteinModelPortal; Q9C0A1; -.
DR   SMR; Q9C0A1; 42-134, 448-513, 708-769, 903-979, 984-1041.
DR   PhosphoSite; Q9UPU6; -.
DR   PRIDE; Q9C0A1; -.
DR   Ensembl; ENST00000258869; ENSP00000258869; ENSG00000136367.
DR   Ensembl; ENST00000382785; ENSP00000372235; ENSG00000136367.
DR   Ensembl; ENST00000419474; ENSP00000413418; ENSG00000136367.
DR   UCSC; uc010akq.1; human.
DR   GeneCards; GC14M023990; -.
DR   HGNC; HGNC:20152; ZFHX2.
DR   HPA; HPA000720; -.
DR   HPA; HPA005146; -.
DR   PharmGKB; PA134951324; -.
DR   eggNOG; prNOG19761; -.
DR   HOGENOM; HBG505562; -.
DR   HOVERGEN; HBG063719; -.
DR   InParanoid; Q9C0A1; -.
DR   OrthoDB; EOG9B0187; -.
DR   PhylomeDB; Q9C0A1; -.
DR   ArrayExpress; Q9C0A1; -.
DR   Bgee; Q9C0A1; -.
DR   CleanEx; HS_ZFHX2; -.
DR   Genevestigator; Q9C0A1; -.
DR   GermOnline; ENSG00000136367; Homo sapiens.
DR   GO; GO:0005737; C:cytoplasm; IDA:HPA.
DR   GO; GO:0005634; C:nucleus; IDA:HPA.
DR   GO; GO:0043565; F:sequence-specific DNA binding; IEA:InterPro.
DR   GO; GO:0003700; F:transcription factor activity; IEA:InterPro.
DR   GO; GO:0008270; F:zinc ion binding; IEA:InterPro.
DR   GO; GO:0006355; P:regulation of transcription, DNA-dependent; IEA:InterPro.
DR   GO; GO:0006350; P:transcription; IEA:UniProtKB-KW.
DR   InterPro; IPR001356; Homeobox.
DR   InterPro; IPR017970; Homeobox_CS.
DR   InterPro; IPR009057; Homeodomain-like.
DR   InterPro; IPR012287; Homeodomain-rel.
DR   InterPro; IPR007087; Znf_C2H2.
DR   InterPro; IPR015880; Znf_C2H2-like.
DR   InterPro; IPR013087; Znf_C2H2/integrase_DNA-bd.
DR   InterPro; IPR003604; Znf_U1.
DR   Gene3D; G3DSA:1.10.10.60; Homeodomain-rel; 3.
DR   Gene3D; G3DSA:3.30.160.60; Znf_C2H2/integrase_DNA-bd; 2.
DR   Pfam; PF00046; Homeobox; 3.
DR   SMART; SM00389; HOX; 3.
DR   SMART; SM00355; ZnF_C2H2; 15.
DR   SMART; SM00451; ZnF_U1; 7.
DR   SUPFAM; SSF46689; Homeodomain_like; 3.
DR   PROSITE; PS00027; HOMEOBOX_1; 1.
DR   PROSITE; PS50071; HOMEOBOX_2; 3.
DR   PROSITE; PS00028; ZINC_FINGER_C2H2_1; 9.
DR   PROSITE; PS50157; ZINC_FINGER_C2H2_2; 5.
PE   2: Evidence at transcript level;
KW   Activator; Alternative splicing; Complete proteome; DNA-binding;
KW   Homeobox; Metal-binding; Nucleus; Repeat; Repressor; Transcription;
KW   Transcription regulation; Zinc; Zinc-finger.
FT   CHAIN         1   2572       Zinc finger homeobox protein 2.
FT                                /FTId=PRO_0000047243.
FT   ZN_FING     821    845       C2H2-type 4.
FT   ZN_FING     870    894       C2H2-type 5.
FT   ZN_FING    1009   1032       C2H2-type 6.
FT   ZN_FING    1191   1217       C2H2-type 7.
FT   ZN_FING    1248   1272       C2H2-type 8.
FT   ZN_FING    1480   1503       C2H2-type 9.
FT   DNA_BIND   1595   1654       Homeobox 1.
FT   ZN_FING    1670   1696       C2H2-type 10; degenerate.
FT   ZN_FING    1769   1791       C2H2-type 11.
FT   DNA_BIND   1857   1916       Homeobox 2.
FT   DNA_BIND   2065   2124       Homeobox 3.
FT   ZN_FING    2451   2472       C2H2-type 12; degenerate.
FT   ZN_FING    2495   2519       C2H2-type 13.
FT   COMPBIAS    605    710       Pro-rich.
FT   COMPBIAS   1061   1144       Pro-rich.
FT   COMPBIAS   1321   1471       Pro-rich.
FT   COMPBIAS   1699   1759       Glu-rich.
FT   COMPBIAS   1921   2018       Pro-rich.
FT   COMPBIAS   2193   2424       Pro-rich.
FT   VAR_SEQ     854    862       FLLDMEGAE -> RTETGLLIK (in isoform 2).
FT                                /FTId=VSP_039496.
FT   VAR_SEQ     863   2572       Missing (in isoform 2).
FT                                /FTId=VSP_039497.
FT   CONFLICT    473    473       R -> Q (in Ref. 1; BAA83008).
FT   CONFLICT    550    550       P -> T (in Ref. 1; BAA83008).
SQ   SEQUENCE   2572 AA;  274176 MW;  239C8050F65B25C2 CRC64;
     MATLNSASTT GTTPSPGHNA PSLPSDTFSS STPSDPVTKD PPAASSTSEN MRSSEPGGQL
     LESGCGLVPP KEIGEPQEGP DCGHFPPNDP GVEKDKEQEE EEEGLPPMDL SNHLFFTAGG
     EAYLVAKLSL PGGSELLLPK GFPWGEAGIK EEPSLPFLAY PPPSHLTALH IQHGFDPIQG
     FSSSDQILSH DTSAPSPAAC EERHGAFWSY QLAPNPPGDP KDGPMGNSGG NHVAVFWLCL
     LCRLGFSKPQ AFMDHTQSHG VKLTPAQYQG LSGSPAVLQE GDEGCKALIS FLEPKLPARP
     SSDIPLDNSS TVNMEANVAQ TEDGPPEAEV QALILLDEEV MALSPPSPPT ATWDPSPTQA
     KESPVAAGEA GPDWFPEGQE EDGGLCPPLN QSSPTSKEGG TLPAPVGSPE DPSDPPQPYR
     LADDYTPAPA AFQGLSLSSH MSLLHSRNSC KTLKCPKCNW HYKYQQTLDV HMREKHPESN
     SHCSYCSAGG AHPRLARGES YNCGYKPYRC DVCNYSTTTK GNLSIHMQSD KHLANLQGFQ
     AGPGGQGSPP EASLPPSAGD KEPKTKSSWQ CKVCSYETNI SRNLRIHMTS EKHMQNVLML
     HQGLPLGLPP GLMGPGPPPP PGATPTSPPE LFQYFGPQAL GQPQTPLAGP GLRPDKPLEA
     QLLLNGFHHV GAPARKFPTS APGSLSPDAH LPPSQLLGSS SDSLPTSPPP DDSLSLKVFR
     CLVCQAFSTD SLELLLYHCS IGRSLPEAEW KEVAGDTHRC KLCCYGTQLK ANFQLHLKTD
     KHAQKYQLAA HLREGGGAMG TPSPASLGDG APYGSVSPLH LRCNICDFES NSKEKMQLHA
     RGAAHEENSQ IYKFLLDMEG AEAGAELGLY HCLLCAWETP SRLAVLQHLR TPAHRDAQAQ
     RRLQLLQNGP TTEEGLAALQ SILSFSHGQL RTPGKAPVTP LAEPPTPEKD AQNKTEQLAS
     EETENKTGPS RDSANQTTVY CCPYCSFLSP ESSQVRAHTL SQHAVQPKYR CPLCQEQLVG
     RPALHFHLSH LHNVVPECVE KLLLVATTVE MTFTTKVLSA PTLSPLDNGQ EPPTHGPEPT
     PSRDQAAEGP NLTPEASPDP LPEPPLASVE VPDKPSGSPG QPPSPAPSPV PEPDAQAEDV
     APPPTMAEEE EGTTGELRSA EPAPADSRHP LTYRKTTNFA LDKFLDPARP YKCTVCKESF
     TQKNILLVHY NSVSHLHKMK KAAIDPSAPA RGEAGAPPTT TAATDKPFKC TVCRVSYNQS
     STLEIHMRSV LHQTRSRGTK TDSKIEGPER SQEEPKEGET EGEVGTEKKG PDTSGFISGL
     PFLSPPPPPL DLHRFPAPLF TPPVLPPFPL VPESLLKLQQ QQLLLPFYLH DLKVGPKLTL
     AGPAPVLSLP AATPPPPPQP PKAELAEREW ERPPMAKEGN EAGPSSPPDP LPNEAARTAA
     KALLENFGFE LVIQYNEGKQ AVPPPPTPPP PEALGGGDKL ACGACGKLFS NMLILKTHEE
     HVHRRFLPFE ALSRYAAQFR KSYDSLYPPL AEPPKPPDGS LDSPVPHLGP PFLVPEPEAG
     GTRAPEERSR AGGHWPIEEE ESSRGNLPPL VPAGRRFSRT KFTEFQTQAL QSFFETSAYP
     KDGEVERLAS LLGLASRVVV VWFQNARQKA RKNACEGGSM PTGGGTGGAS GCRRCHATFS
     CVFELVRHLK KCYDDQTLEE EEEEAERGEE EEEVEEEEVE EEQGLEPPAG PEGPLPEPPD
     GEELSQAEAT KAGGKEPEEK ATPSPSPAHT CDQCAISFSS QDLLTSHRRL HFLPSLQPSA
     PPQLLDLPLL VFGERNPLVA ATSPMPGPPL KRKHEDGSLS PTGSEAGGGG EGEPPRDKRL
     RTTILPEQLE ILYRWYMQDS NPTRKMLDCI SEEVGLKKRV VQVWFQNTRA RERKGQFRST
     PGGVPSPAVK PPATATPASL PKFNLLLGKV DDGTGREAPK REAPAFPYPT ATLASGPQPF
     LPPGKEATTP TPEPPLPLLP PPPPSEEEGP EEPPKASPES EACSLSAGDL SDSSASSLAE
     PESPGAGGTS GGPGGGTGVP DGMGQRRYRT QMSSLQLKIM KACYEAYRTP TMQECEVLGE
     EIGLPKRVIQ VWFQNARAKE KKAKLQGTAA GSTGGSSEGL LAAQRTDCPY CDVKYDFYVS
     CRGHLFSRQH LAKLKEAVRA QLKSESKCYD LAPAPEAPPA LKAPPATTPA SMPLGAAPTL
     PRLAPVLLSG PALAQPPLGN LAPFNSGPAA SSGLLGLATS VLPTTTVVQT AGPGRPLPQR
     PMPDQTNTST AGTTDPVPGP PTEPLGDKVS SERKPVAGPT SSSNDALKNL KALKTTVPAL
     LGGQFLPFPL PPAGGTAPPA VFGPQLQGAY FQQLYGMKKG LFPMNPMIPQ TLIGLLPNAL
     LQPPPQPPEP TATAPPKPPE LPAPGEGEAG EVDELLTGST GISTVDVTHR YLCRQCKMAF
     DGEAPATAHQ RSFCFFGRGS GGSMPPPLRV PICTYHCLAC EVLLSGREAL ASHLRSSAHR
     RKAAPPQGGP PISITNAATA ASAAVAFAKE EARLPHTDSN PKTTTTSTLL AL
//