ID   COLA1_HUMAN             Reviewed;         957 AA.
AC   Q96P44; A6NIX5; B2R8J9; Q49A51; Q71RF4; Q8WXV8; Q9H0V3;
DT   05-FEB-2008, integrated into UniProtKB/Swiss-Prot.
DT   01-DEC-2001, sequence version 1.
DT   27-MAR-2024, entry version 179.
DE   RecName: Full=Collagen alpha-1(XXI) chain;
DE   Flags: Precursor;
GN   Name=COL21A1; Synonyms=COL1AL; ORFNames=FP633;
OS   Homo sapiens (Human).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae;
OC   Homo.
OX   NCBI_TaxID=9606;
RN   [1]
RP   NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1), AND TISSUE SPECIFICITY.
RX   PubMed=11566190; DOI=10.1016/s0014-5793(01)02754-5;
RA   Fitzgerald J., Bateman J.F.;
RT   "A new FACIT of the collagen family: COL21A1.";
RL   FEBS Lett. 505:275-280(2001).
RN   [2]
RP   NUCLEOTIDE SEQUENCE [GENOMIC DNA / MRNA] (ISOFORMS 1 AND 3), DEVELOPMENTAL
RP   STAGE, TISSUE SPECIFICITY, SUBCELLULAR LOCATION, AND INDUCTION BY PDGF.
RX   PubMed=11863369; DOI=10.1006/geno.2002.6712;
RA   Chou M.-Y., Li H.-C.;
RT   "Genomic organization and characterization of the human type XXI collagen
RT   (COL21A1) gene.";
RL   Genomics 79:395-401(2002).
RN   [3]
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1).
RC   TISSUE=Brain;
RX   PubMed=11230166; DOI=10.1101/gr.gr1547r;
RA   Wiemann S., Weil B., Wellenreuther R., Gassenhuber J., Glassl S.,
RA   Ansorge W., Boecher M., Bloecker H., Bauersachs S., Blum H., Lauber J.,
RA   Duesterhoeft A., Beyer A., Koehrer K., Strack N., Mewes H.-W.,
RA   Ottenwaelder B., Obermaier B., Tampe J., Heubner D., Wambutt R., Korn B.,
RA   Klein M., Poustka A.;
RT   "Towards a catalog of human genes and proteins: sequencing and analysis of
RT   500 novel complete protein coding human cDNAs.";
RL   Genome Res. 11:422-435(2001).
RN   [4]
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2).
RX   PubMed=15498874; DOI=10.1073/pnas.0404089101;
RA   Wan D., Gong Y., Qin W., Zhang P., Li J., Wei L., Zhou X., Li H., Qiu X.,
RA   Zhong F., He L., Yu J., Yao G., Jiang H., Qian L., Yu Y., Shu H., Chen X.,
RA   Xu H., Guo M., Pan Z., Chen Y., Ge C., Yang S., Gu J.;
RT   "Large-scale cDNA transfection screening for genes related to cancer
RT   development and progression.";
RL   Proc. Natl. Acad. Sci. U.S.A. 101:15724-15729(2004).
RN   [5]
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1), AND VARIANT MET-343.
RC   TISSUE=Trachea;
RX   PubMed=14702039; DOI=10.1038/ng1285;
RA   Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R.,
RA   Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H.,
RA   Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S.,
RA   Yamamoto J., Saito K., Kawai Y., Isono Y., Nakamura Y., Nagahari K.,
RA   Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., Shiratori A., Sudo H.,
RA   Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M., Takahashi M.,
RA   Kanda K., Yokoi T., Furuya T., Kikkawa E., Omura Y., Abe K., Kamihara K.,
RA   Katsuta N., Sato K., Tanikawa M., Yamazaki M., Ninomiya K., Ishibashi T.,
RA   Yamashita H., Murakawa K., Fujimori K., Tanai H., Kimata M., Watanabe M.,
RA   Hiraoka S., Chiba Y., Ishida S., Ono Y., Takiguchi S., Watanabe S.,
RA   Yosida M., Hotuta T., Kusano J., Kanehori K., Takahashi-Fujii A., Hara H.,
RA   Tanase T.-O., Nomura Y., Togiya S., Komai F., Hara R., Takeuchi K.,
RA   Arita M., Imose N., Musashino K., Yuuki H., Oshima A., Sasaki N.,
RA   Aotsuka S., Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S.,
RA   Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O.,
RA   Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H.,
RA   Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B.,
RA   Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y., Fujimori Y.,
RA   Komiyama M., Tashiro H., Tanigami A., Fujiwara T., Ono T., Yamada K.,
RA   Fujii Y., Ozaki K., Hirao M., Ohmori Y., Kawabata A., Hikiji T.,
RA   Kobatake N., Inagaki H., Ikema Y., Okamoto S., Okitani R., Kawakami T.,
RA   Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K., Nakajima Y.,
RA   Mizuno T., Morinaga M., Sasaki M., Togashi T., Oyama M., Hata H.,
RA   Watanabe M., Komatsu T., Mizushima-Sugano J., Satoh T., Shirai Y.,
RA   Takahashi Y., Nakagawa K., Okumura K., Nagase T., Nomura N., Kikuchi H.,
RA   Masuho Y., Yamashita R., Nakai K., Yada T., Nakamura Y., Ohara O.,
RA   Isogai T., Sugano S.;
RT   "Complete sequencing and characterization of 21,243 full-length human
RT   cDNAs.";
RL   Nat. Genet. 36:40-45(2004).
RN   [6]
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RX   PubMed=14574404; DOI=10.1038/nature02055;
RA   Mungall A.J., Palmer S.A., Sims S.K., Edwards C.A., Ashurst J.L.,
RA   Wilming L., Jones M.C., Horton R., Hunt S.E., Scott C.E., Gilbert J.G.R.,
RA   Clamp M.E., Bethel G., Milne S., Ainscough R., Almeida J.P., Ambrose K.D.,
RA   Andrews T.D., Ashwell R.I.S., Babbage A.K., Bagguley C.L., Bailey J.,
RA   Banerjee R., Barker D.J., Barlow K.F., Bates K., Beare D.M., Beasley H.,
RA   Beasley O., Bird C.P., Blakey S.E., Bray-Allen S., Brook J., Brown A.J.,
RA   Brown J.Y., Burford D.C., Burrill W., Burton J., Carder C., Carter N.P.,
RA   Chapman J.C., Clark S.Y., Clark G., Clee C.M., Clegg S., Cobley V.,
RA   Collier R.E., Collins J.E., Colman L.K., Corby N.R., Coville G.J.,
RA   Culley K.M., Dhami P., Davies J., Dunn M., Earthrowl M.E., Ellington A.E.,
RA   Evans K.A., Faulkner L., Francis M.D., Frankish A., Frankland J.,
RA   French L., Garner P., Garnett J., Ghori M.J., Gilby L.M., Gillson C.J.,
RA   Glithero R.J., Grafham D.V., Grant M., Gribble S., Griffiths C.,
RA   Griffiths M.N.D., Hall R., Halls K.S., Hammond S., Harley J.L., Hart E.A.,
RA   Heath P.D., Heathcott R., Holmes S.J., Howden P.J., Howe K.L., Howell G.R.,
RA   Huckle E., Humphray S.J., Humphries M.D., Hunt A.R., Johnson C.M.,
RA   Joy A.A., Kay M., Keenan S.J., Kimberley A.M., King A., Laird G.K.,
RA   Langford C., Lawlor S., Leongamornlert D.A., Leversha M., Lloyd C.R.,
RA   Lloyd D.M., Loveland J.E., Lovell J., Martin S., Mashreghi-Mohammadi M.,
RA   Maslen G.L., Matthews L., McCann O.T., McLaren S.J., McLay K., McMurray A.,
RA   Moore M.J.F., Mullikin J.C., Niblett D., Nickerson T., Novik K.L.,
RA   Oliver K., Overton-Larty E.K., Parker A., Patel R., Pearce A.V., Peck A.I.,
RA   Phillimore B.J.C.T., Phillips S., Plumb R.W., Porter K.M., Ramsey Y.,
RA   Ranby S.A., Rice C.M., Ross M.T., Searle S.M., Sehra H.K., Sheridan E.,
RA   Skuce C.D., Smith S., Smith M., Spraggon L., Squares S.L., Steward C.A.,
RA   Sycamore N., Tamlyn-Hall G., Tester J., Theaker A.J., Thomas D.W.,
RA   Thorpe A., Tracey A., Tromans A., Tubby B., Wall M., Wallis J.M.,
RA   West A.P., White S.S., Whitehead S.L., Whittaker H., Wild A., Willey D.J.,
RA   Wilmer T.E., Wood J.M., Wray P.W., Wyatt J.C., Young L., Younger R.M.,
RA   Bentley D.R., Coulson A., Durbin R.M., Hubbard T., Sulston J.E., Dunham I.,
RA   Rogers J., Beck S.;
RT   "The DNA sequence and analysis of human chromosome 6.";
RL   Nature 425:805-811(2003).
RN   [7]
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1).
RC   TISSUE=Brain;
RX   PubMed=15489334; DOI=10.1101/gr.2596504;
RG   The MGC Project Team;
RT   "The status, quality, and expansion of the NIH full-length cDNA project:
RT   the Mammalian Gene Collection (MGC).";
RL   Genome Res. 14:2121-2127(2004).
CC   -!- SUBCELLULAR LOCATION: Secreted, extracellular space, extracellular
CC       matrix {ECO:0000269|PubMed:11863369}. Cytoplasm
CC       {ECO:0000269|PubMed:11863369}. Note=Found in the extracellular matrix
CC       component of blood vessel walls and in the cytoplasm of cultured human
CC       aortic smooth muscle.
CC   -!- ALTERNATIVE PRODUCTS:
CC       Event=Alternative splicing; Named isoforms=3;
CC       Name=1;
CC         IsoId=Q96P44-1; Sequence=Displayed;
CC       Name=2;
CC         IsoId=Q96P44-2; Sequence=VSP_031083, VSP_031085, VSP_031086;
CC       Name=3;
CC         IsoId=Q96P44-3; Sequence=VSP_031084;
CC   -!- TISSUE SPECIFICITY: Highly expressed in lymph node, jejunum, pancreas,
CC       stomach, trachea, testis, uterus and placenta; moderately expressed in
CC       brain, colon, lung, prostate, spinal cord, salivary gland and vascular
CC       smooth-muscle cells and very weakly expressed in heart, liver, kidney,
CC       bone marrow, spleen, thymus, skeletal muscle, adrenal gland and
CC       peripheral leukocytes. Expression in heart was higher in the right
CC       ventricle and atrium than in the left ventricle and atrium.
CC       {ECO:0000269|PubMed:11566190, ECO:0000269|PubMed:11863369}.
CC   -!- DEVELOPMENTAL STAGE: Highest expression observed at the fetal stage.
CC       Expressed by smooth-muscle cells in the artery wall in a PDGF-dependent
CC       way. {ECO:0000269|PubMed:11863369}.
CC   -!- INDUCTION: Stimulated by PDGF/platelet-derived growth factor.
CC       {ECO:0000269|PubMed:11863369}.
CC   -!- SIMILARITY: Belongs to the fibril-associated collagens with interrupted
CC       helices (FACIT) family. {ECO:0000305}.
CC   -!- SEQUENCE CAUTION:
CC       Sequence=AAH45597.1; Type=Frameshift; Evidence={ECO:0000305};
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; AF414088; AAL02227.1; -; mRNA.
DR   EMBL; AF330693; AAL50033.1; -; mRNA.
DR   EMBL; AF438327; AAL86699.1; -; Genomic_DNA.
DR   EMBL; AL136624; CAB66559.1; -; mRNA.
DR   EMBL; AF370383; AAQ15219.1; -; mRNA.
DR   EMBL; AK313398; BAG36196.1; -; mRNA.
DR   EMBL; AL031782; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR   EMBL; AL034452; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR   EMBL; AL513530; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR   EMBL; BC045597; AAH45597.1; ALT_FRAME; mRNA.
DR   EMBL; BC126108; AAI26109.1; -; mRNA.
DR   CCDS; CCDS55025.1; -. [Q96P44-1]
DR   CCDS; CCDS83099.1; -. [Q96P44-3]
DR   RefSeq; NP_001305680.1; NM_001318751.1. [Q96P44-1]
DR   RefSeq; NP_001305681.1; NM_001318752.1. [Q96P44-3]
DR   RefSeq; NP_001305682.1; NM_001318753.1.
DR   RefSeq; NP_001305683.1; NM_001318754.1.
DR   RefSeq; NP_110447.2; NM_030820.3. [Q96P44-1]
DR   RefSeq; XP_011513226.1; XM_011514924.2. [Q96P44-1]
DR   RefSeq; XP_011513227.1; XM_011514925.2. [Q96P44-1]
DR   RefSeq; XP_011513228.1; XM_011514926.1. [Q96P44-1]
DR   RefSeq; XP_011513229.1; XM_011514927.1. [Q96P44-1]
DR   AlphaFoldDB; Q96P44; -.
DR   SMR; Q96P44; -.
DR   BioGRID; 123538; 8.
DR   ComplexPortal; CPX-1762; Collagen type XXI trimer.
DR   IntAct; Q96P44; 1.
DR   MINT; Q96P44; -.
DR   STRING; 9606.ENSP00000244728; -.
DR   GlyCosmos; Q96P44; 2 sites, 1 glycan.
DR   GlyGen; Q96P44; 2 sites, 1 O-linked glycan (1 site).
DR   iPTMnet; Q96P44; -.
DR   PhosphoSitePlus; Q96P44; -.
DR   BioMuta; COL21A1; -.
DR   DMDM; 74752071; -.
DR   jPOST; Q96P44; -.
DR   MassIVE; Q96P44; -.
DR   PaxDb; 9606-ENSP00000244728; -.
DR   PeptideAtlas; Q96P44; -.
DR   ProteomicsDB; 77615; -. [Q96P44-1]
DR   ProteomicsDB; 77616; -. [Q96P44-2]
DR   ProteomicsDB; 77617; -. [Q96P44-3]
DR   Antibodypedia; 31049; 114 antibodies from 20 providers.
DR   DNASU; 81578; -.
DR   Ensembl; ENST00000244728.10; ENSP00000244728.5; ENSG00000124749.18. [Q96P44-1]
DR   Ensembl; ENST00000370819.5; ENSP00000359855.1; ENSG00000124749.18. [Q96P44-3]
DR   GeneID; 81578; -.
DR   KEGG; hsa:81578; -.
DR   MANE-Select; ENST00000244728.10; ENSP00000244728.5; NM_030820.4; NP_110447.2.
DR   UCSC; uc003pcs.4; human. [Q96P44-1]
DR   AGR; HGNC:17025; -.
DR   CTD; 81578; -.
DR   DisGeNET; 81578; -.
DR   GeneCards; COL21A1; -.
DR   HGNC; HGNC:17025; COL21A1.
DR   HPA; ENSG00000124749; Tissue enhanced (cervix, placenta).
DR   MIM; 610002; gene.
DR   neXtProt; NX_Q96P44; -.
DR   OpenTargets; ENSG00000124749; -.
DR   PharmGKB; PA26714; -.
DR   VEuPathDB; HostDB:ENSG00000124749; -.
DR   eggNOG; KOG3544; Eukaryota.
DR   GeneTree; ENSGT00940000162318; -.
DR   HOGENOM; CLU_001074_18_0_1; -.
DR   InParanoid; Q96P44; -.
DR   OMA; CAHCQLQ; -.
DR   OrthoDB; 2883115at2759; -.
DR   PhylomeDB; Q96P44; -.
DR   TreeFam; TF332934; -.
DR   PathwayCommons; Q96P44; -.
DR   Reactome; R-HSA-1650814; Collagen biosynthesis and modifying enzymes.
DR   Reactome; R-HSA-8948216; Collagen chain trimerization.
DR   SIGNOR; Q96P44; -.
DR   BioGRID-ORCS; 81578; 6 hits in 1137 CRISPR screens.
DR   ChiTaRS; COL21A1; human.
DR   GenomeRNAi; 81578; -.
DR   Pharos; Q96P44; Tbio.
DR   PRO; PR:Q96P44; -.
DR   Proteomes; UP000005640; Chromosome 6.
DR   RNAct; Q96P44; Protein.
DR   Bgee; ENSG00000124749; Expressed in blood vessel layer and 168 other cell types or tissues.
DR   ExpressionAtlas; Q96P44; baseline and differential.
DR   GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR   GO; GO:0062023; C:collagen-containing extracellular matrix; HDA:BHF-UCL.
DR   GO; GO:0005829; C:cytosol; IDA:HPA.
DR   GO; GO:0005788; C:endoplasmic reticulum lumen; TAS:Reactome.
DR   GO; GO:0005576; C:extracellular region; TAS:Reactome.
DR   Gene3D; 2.60.120.200; -; 1.
DR   Gene3D; 1.20.5.320; 6-Phosphogluconate Dehydrogenase, domain 3; 1.
DR   Gene3D; 3.40.50.410; von Willebrand factor, type A domain; 1.
DR   InterPro; IPR008160; Collagen.
DR   InterPro; IPR013320; ConA-like_dom_sf.
DR   InterPro; IPR048287; TSPN-like_N.
DR   InterPro; IPR002035; VWF_A.
DR   InterPro; IPR036465; vWFA_dom_sf.
DR   PANTHER; PTHR24020; COLLAGEN ALPHA; 1.
DR   PANTHER; PTHR24020:SF20; COLLAGEN ALPHA-1(XXI) CHAIN; 1.
DR   Pfam; PF01391; Collagen; 4.
DR   Pfam; PF00092; VWA; 1.
DR   PRINTS; PR00453; VWFADOMAIN.
DR   SMART; SM00210; TSPN; 1.
DR   SMART; SM00327; VWA; 1.
DR   SUPFAM; SSF49899; Concanavalin A-like lectins/glucanases; 1.
DR   SUPFAM; SSF53300; vWA-like; 1.
DR   PROSITE; PS50234; VWFA; 1.
DR   Genevisible; Q96P44; HS.
PE   2: Evidence at transcript level;
KW   Alternative splicing; Collagen; Cytoplasm; Extracellular matrix;
KW   Glycoprotein; Reference proteome; Repeat; Secreted; Signal.
FT   SIGNAL          1..22
FT                   /evidence="ECO:0000255"
FT   CHAIN           23..957
FT                   /note="Collagen alpha-1(XXI) chain"
FT                   /id="PRO_0000317613"
FT   DOMAIN          37..211
FT                   /note="VWFA"
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU00219"
FT   DOMAIN          230..412
FT                   /note="Laminin G-like"
FT   DOMAIN          448..500
FT                   /note="Collagen-like 1"
FT   DOMAIN          501..542
FT                   /note="Collagen-like 2"
FT   DOMAIN          543..594
FT                   /note="Collagen-like 3"
FT   DOMAIN          681..733
FT                   /note="Collagen-like 4"
FT   DOMAIN          734..787
FT                   /note="Collagen-like 5"
FT   DOMAIN          825..882
FT                   /note="Collagen-like 6"
FT   DOMAIN          884..934
FT                   /note="Collagen-like 7"
FT   REGION          448..786
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          825..938
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        828..842
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        888..908
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   CARBOHYD        62
FT                   /note="N-linked (GlcNAc...) asparagine"
FT                   /evidence="ECO:0000255"
FT   VAR_SEQ         1..600
FT                   /note="Missing (in isoform 2)"
FT                   /evidence="ECO:0000303|PubMed:15498874"
FT                   /id="VSP_031083"
FT   VAR_SEQ         427..429
FT                   /note="Missing (in isoform 3)"
FT                   /evidence="ECO:0000303|PubMed:11863369"
FT                   /id="VSP_031084"
FT   VAR_SEQ         601..604
FT                   /note="RGEP -> MIAS (in isoform 2)"
FT                   /evidence="ECO:0000303|PubMed:15498874"
FT                   /id="VSP_031085"
FT   VAR_SEQ         803..836
FT                   /note="Missing (in isoform 2)"
FT                   /evidence="ECO:0000303|PubMed:15498874"
FT                   /id="VSP_031086"
FT   VARIANT         277
FT                   /note="L -> P (in dbSNP:rs2764043)"
FT                   /id="VAR_038555"
FT   VARIANT         343
FT                   /note="T -> M (in dbSNP:rs35471617)"
FT                   /evidence="ECO:0000269|PubMed:14702039"
FT                   /id="VAR_038556"
FT   VARIANT         495
FT                   /note="I -> T (in dbSNP:rs35583895)"
FT                   /id="VAR_038557"
FT   VARIANT         560
FT                   /note="G -> S (in dbSNP:rs9382581)"
FT                   /id="VAR_038558"
FT   VARIANT         747
FT                   /note="A -> D (in dbSNP:rs9464337)"
FT                   /id="VAR_038559"
FT   VARIANT         821
FT                   /note="L -> P (in dbSNP:rs12209452)"
FT                   /id="VAR_038560"
FT   VARIANT         827
FT                   /note="P -> A (in dbSNP:rs1555131)"
FT                   /id="VAR_038561"
FT   CONFLICT        129
FT                   /note="A -> D (in Ref. 3; CAB66559)"
FT                   /evidence="ECO:0000305"
FT   CONFLICT        780
FT                   /note="L -> W (in Ref. 7; AAH45597)"
FT                   /evidence="ECO:0000305"
FT   CONFLICT        802
FT                   /note="R -> K (in Ref. 7; AAH45597)"
FT                   /evidence="ECO:0000305"
SQ   SEQUENCE   957 AA;  99369 MW;  4C5CDF5E6656A675 CRC64;
     MAHYITFLCM VLVLLLQNSV LAEDGEVRSS CRTAPTDLVF ILDGSYSVGP ENFEIVKKWL
     VNITKNFDIG PKFIQVGVVQ YSDYPVLEIP LGSYDSGEHL TAAVESILYL GGNTKTGKAI
     QFALDYLFAK SSRFLTKIAV VLTDGKSQDD VKDAAQAARD SKITLFAIGV GSETEDAELR
     AIANKPSSTY VFYVEDYIAI SKIREVMKQK LCEESVCPTR IPVAARDERG FDILLGLDVN
     KKVKKRIQLS PKKIKGYEVT SKVDLSELTS NVFPEGLPPS YVFVSTQRFK VKKIWDLWRI
     LTIDGRPQIA VTLNGVDKIL LFTTTSVING SQVVTFANPQ VKTLFDEGWH QIRLLVTEQD
     VTLYIDDQQI ENKPLHPVLG ILINGQTQIG KYSGKEETVQ FDVQKLRIYC DPEQNNRETA
     CEIPGFNGEC LNGPSDVGST PAPCICPPGK PGLQGPKGDP GLPGNPGYPG QPGQDGKPGY
     QGIAGTPGVP GSPGIQGARG LPGYKGEPGR DGDKGDRGLP GFPGLHGMPG SKGEMGAKGD
     KGSPGFYGKK GAKGEKGNAG FPGLPGPAGE PGRHGKDGLM GSPGFKGEAG SPGAPGQDGT
     RGEPGIPGFP GNRGLMGQKG EIGPPGQQGK KGAPGMPGLM GSNGSPGQPG TPGSKGSKGE
     PGIQGMPGAS GLKGEPGATG SPGEPGYMGL PGIQGKKGDK GNQGEKGIQG QKGENGRQGI
     PGQQGIQGHH GAKGERGEKG EPGVRGAIGS KGESGVDGLM GPAGPKGQPG DPGPQGPPGL
     DGKPGREFSE QFIRQVCTDV IRAQLPVLLQ SGRIRNCDHC LSQHGSPGIP GPPGPIGPEG
     PRGLPGLPGR DGVPGLVGVP GRPGVRGLKG LPGRNGEKGS QGFGYPGEQG PPGPPGPEGP
     PGISKEGPPG DPGLPGKDGD HGKPGIQGQP GPPGICDPSL CFSVIARRDP FRKGPNY
//