ID CSOS2_HALNC Reviewed; 869 AA. AC O85041; D0KZ90; DT 10-FEB-2021, integrated into UniProtKB/Swiss-Prot. DT 01-NOV-1998, sequence version 1. DT 10-FEB-2021, entry version 30. DE RecName: Full=Carboxysome assembly protein CsoS2B {ECO:0000303|PubMed:25826651}; DE AltName: Full=Carboxysome shell protein CsoS2B; GN Name=csoS2 {ECO:0000303|PubMed:9696760}; OrderedLocusNames=Hneap_0920; OS Halothiobacillus neapolitanus (strain ATCC 23641 / c2) (Thiobacillus OS neapolitanus). OC Bacteria; Proteobacteria; Gammaproteobacteria; Chromatiales; OC Halothiobacillaceae; Halothiobacillus. OX NCBI_TaxID=555778; RN [1] RP NUCLEOTIDE SEQUENCE [GENOMIC DNA]. RC STRAIN=ATCC 23641 / c2; RX PubMed=9696760; DOI=10.1128/jb.180.16.4133-4139.1998; RA Baker S.H., Jin S., Aldrich H.C., Howard G.T., Shively J.M.; RT "Insertion mutation of the form I cbbL gene encoding ribulose bisphosphate RT carboxylase/oxygenase (RuBisCO) in Thiobacillus neapolitanus results in RT expression of form II RuBisCO, loss of carboxysomes, and an increased CO2 RT requirement for growth."; RL J. Bacteriol. 180:4133-4139(1998). RN [2] RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]. RC STRAIN=ATCC 23641 / c2; RG US DOE Joint Genome Institute; RA Lucas S., Copeland A., Lapidus A., Glavina del Rio T., Tice H., Bruce D., RA Goodwin L., Pitluck S., Davenport K., Brettin T., Detter J.C., Han C., RA Tapia R., Larimer F., Land M., Hauser L., Kyrpides N., Mikhailova N., RA Kerfeld C., Cannon G., Heinhort S.; RT "Complete sequence of Halothiobacillus neapolitanus c2."; RL Submitted (OCT-2009) to the EMBL/GenBank/DDBJ databases. RN [3] RP PROTEIN SEQUENCE OF 2-18, SUBCELLULAR LOCATION, TWO PROTEIN FORMS, AND RP PUTATIVE GLYCOSYLATION. 
RC STRAIN=ATCC 23641 / c2; RX PubMed=10525740; DOI=10.1007/s002030050765; RA Baker S.H., Lorbach S.C., Rodriguez-Buey M., Williams D.S., Aldrich H.C., RA Shively J.M.; RT "The correlation of the gene csoS2 of the carboxysome operon with two RT polypeptides of the carboxysome in Thiobacillus neapolitanus."; RL Arch. Microbiol. 172:233-239(1999). RN [4] RP FUNCTION, PROTEIN ABUNDANCE, AND SUBCELLULAR LOCATION. RX DOI=10.1007/7171_023; RA Heinhorst S., Cannon G.C., Shively J.M.; RT "Carboxysomes and Carboxysome-like Inclusions."; RL (In) Shively J.M. (eds.); RL Microbiology Monographs, pp.2:141-164, Springer-Verlag, Berlin (2006). RN [5] RP SUBCELLULAR LOCATION. RC STRAIN=ATCC 23641 / c2; RX PubMed=18258595; DOI=10.1074/jbc.m709285200; RA Dou Z., Heinhorst S., Williams E.B., Murin C.D., Shively J.M., Cannon G.C.; RT "CO2 fixation kinetics of Halothiobacillus neapolitanus mutant carboxysomes RT lacking carbonic anhydrase suggest the shell acts as a diffusional barrier RT for CO2."; RL J. Biol. Chem. 283:10377-10384(2008). RN [6] RP BIOTECHNOLOGY. RC STRAIN=ATCC 23641 / c2; RX PubMed=22184212; DOI=10.1073/pnas.1108557109; RA Bonacci W., Teng P.K., Afonso B., Niederholtmeyer H., Grob P., Silver P.A., RA Savage D.F.; RT "Modularity of a carbon-fixing protein organelle."; RL Proc. Natl. Acad. Sci. U.S.A. 109:478-483(2012). RN [7] RP FUNCTION, SUBUNIT, SUBCELLULAR LOCATION, DOMAIN, TWO PROTEIN FORMS, AND RP DISRUPTION PHENOTYPE. RC STRAIN=ATCC 23641 / c2; RX PubMed=25826651; DOI=10.3390/life5021141; RA Cai F., Dou Z., Bernstein S.L., Leverenz R., Williams E.B., Heinhorst S., RA Shively J., Cannon G.C., Kerfeld C.A.; RT "Advances in Understanding Carboxysome Assembly in Prochlorococcus and RT Synechococcus Implicate CsoS2 as a Critical Component."; RL Life 5:1141-1171(2015). RN [8] RP FUNCTION, SUBCELLULAR LOCATION, RIBOSOMAL FRAMESHIFT, ISOFORMS CSOS2A AND RP CSOS2B, DOMAIN, AND MASS SPECTROMETRY. 
RX PubMed=26608811; DOI=10.1016/j.jmb.2015.11.017; RA Chaijarasphong T., Nichols R.J., Kortright K.E., Nixon C.F., Teng P.K., RA Oltrogge L.M., Savage D.F.; RT "Programmed Ribosomal Frameshifting Mediates Expression of the alpha- RT Carboxysome."; RL J. Mol. Biol. 428:153-164(2016). RN [9] RP INTERACTION WITH CBBS, AND BIOTECHNOLOGY. RC STRAIN=ATCC 23641 / c2; RX PubMed=30305640; DOI=10.1038/s41598-018-33074-x; RA Liu Y., He X., Lim W., Mueller J., Lawrie J., Kramer L., Guo J., Niu W.; RT "Deciphering molecular details in the assembly of alpha-type carboxysome."; RL Sci. Rep. 8:15062-15062(2018). RN [10] RP FUNCTION, INTERACTION WITH RUBISCO, DOMAIN, AND MUTAGENESIS OF RP 18-LYS--ARG-25; 93-ARG--ARG-100; 174-ARG--ARG-181 AND 220-ARG--ARG-227. RX PubMed=32123388; DOI=10.1038/s41594-020-0387-7; RA Oltrogge L.M., Chaijarasphong T., Chen A.W., Bolin E.R., Marqusee S., RA Savage D.F.; RT "Multivalent interactions between CsoS2 and Rubisco mediate alpha- RT carboxysome formation."; RL Nat. Struct. Mol. Biol. 27:281-287(2020). RN [11] RP FUNCTION IN ASSEMBLY, DOMAIN, DISRUPTION PHENOTYPE, BIOTECHNOLOGY, RP ENCAPSULATION SIGNAL, AND MUTAGENESIS OF 839-PRO--GLY-869. RX PubMed=33116131; DOI=10.1038/s41467-020-19280-0; RA Li T., Jiang Q., Huang J., Aitchison C.M., Huang F., Yang M., Dykes G.F., RA He H.L., Wang Q., Sprick R.S., Cooper A.I., Liu L.N.; RT "Reprogramming bacterial protein organelles as a nanoreactor for hydrogen RT production."; RL Nat. Commun. 11:5448-5448(2020). CC -!- FUNCTION: Required for alpha-carboxysome (Cb) assembly, mediates CC interaction between RuBisCO and the Cb shell (PubMed:25826651, CC PubMed:32123388) (Probable). The 3 C-terminal repeats act as the CC encapsulation signal to target proteins to the Cb; they are necessary CC and sufficient to target both CsoS2 and foreign proteins to the Cb CC (PubMed:33116131). 
The N-terminal repeats of this (probably) CC intrinsically disordered protein bind simultaneously to both subunits CC of RuBisCO; minimally 2 N-repeats are necessary for RuBisCO assembly CC into the Cb in vivo. Probably also interacts with the major shell CC proteins (CsoS1); that interaction would increase the local CC concentration of CsoS2 so that it can condense RuBisCO and full CC carboxysomes can be formed (PubMed:32123388). The long form is CC essential for Cb formation while the short form is not CC (PubMed:26608811). There are estimated to be 143 CsoS2A and 186 CsoS2B CC proteins per Cb (Ref.4). {ECO:0000269|PubMed:25826651, CC ECO:0000269|PubMed:26608811, ECO:0000269|PubMed:32123388, CC ECO:0000269|PubMed:33116131, ECO:0000269|Ref.4, CC ECO:0000305|PubMed:33116131}. CC -!- FUNCTION: Unlike beta-carboxysomes, alpha-carboxysomes (Cb) can form CC without cargo protein. CsoS2 is essential for Cb formation and is also CC capable of targeting foreign proteins to the Cb. The Cb shell assembles CC with the aid of CsoS2; CsoS1A, CsoS1B and CsoS1C form the majority of CC the shell while CsoS4A and CsoS4B form vertices. CsoS1D forms CC pseudohexamers that probably control metabolite flux into and out of CC the shell. {ECO:0000269|PubMed:25826651, ECO:0000269|PubMed:26608811, CC ECO:0000269|PubMed:32123388, ECO:0000269|PubMed:33116131}. CC -!- SUBUNIT: Probably interacts with the carboxysome (Cb) major shell CC protein CsoS1; this complex probably also interacts with RuBisCO CC (Probable). Interacts with CbbS (the small subunit of RuBisCO) but not CC the large subunit (CbbL) (PubMed:30305640). RuBisCO interacts with the CC N-terminal repeats of this protein; mutating the third and tenth basic CC residue of each N-repeat prevents binding. Binding is sensitive to CC ionic strength. 
A fusion of a single N-terminal repeat to the C- CC terminus of the large subunit of RuBisCO (cbbL) shows the repeat can CC lie between a CbbL dimer, making minor contacts to CbbS; thus each CC RuBisCO holoenzyme could bind 8 repeats. At least 2 N-repeats are CC required for RuBisCO assembly into the Cb (PubMed:32123388). CC {ECO:0000269|PubMed:30305640, ECO:0000269|PubMed:32123388, CC ECO:0000305|PubMed:25826651}. CC -!- SUBCELLULAR LOCATION: Carboxysome {ECO:0000269|PubMed:10525740, CC ECO:0000269|PubMed:18258595, ECO:0000269|PubMed:25826651, CC ECO:0000269|PubMed:26608811}. Note=Immunogold staining shows this is a CC shell protein (PubMed:10525740). C-terminally tagged protein CC immunoprecipitates whole carboxysomes, showing the C-terminus is on the CC exterior (PubMed:25826651). This bacterium makes alpha-type CC carboxysomes (Ref.4, PubMed:18258595). {ECO:0000269|PubMed:10525740, CC ECO:0000269|PubMed:18258595, ECO:0000269|PubMed:25826651, CC ECO:0000269|Ref.4}. CC -!- ALTERNATIVE PRODUCTS: CC Event=Ribosomal frameshifting; Named isoforms=2; CC Comment=The production of the two protein products from this region CC is due to programmed -1 ribosomal frameshifting. In vivo about half CC the protein product is frameshifted. {ECO:0000269|PubMed:26608811, CC ECO:0000305|PubMed:10525740, ECO:0000305|PubMed:25826651}; CC Name=CsoS2B; CC IsoId=O85041-1; Sequence=Displayed; CC Name=CsoS2A; CC IsoId=O85041-2; Sequence=VSP_060905, VSP_060906; CC -!- DOMAIN: Has 3 domains; the N-terminal domain has 4 short repeats and CC binds RuBisCO. The central region has 6 longer repeats (Probable) CC (PubMed:32123388). The C-terminal domain has 3 repeats and a highly CC conserved C-terminal peptide (Probable). The 3 C-repeats serve as the CC encapsulation signal for the alpha-carboxysome (Cb), and are able to CC target foreign proteins to this organelle. Proteins can be targeted to CC the Cb by a single C-repeat; however, more repeats yield more efficient CC targeting.
The C-terminal peptide (CTP) is not required for cargo CC targeting and is probably on the outside of the Cb (PubMed:33116131, CC PubMed:25826651). {ECO:0000269|PubMed:25826651, CC ECO:0000269|PubMed:32123388, ECO:0000269|PubMed:33116131, CC ECO:0000305|PubMed:25826651, ECO:0000305|PubMed:26608811}. CC -!- PTM: Seen in gels as 2 forms (of 85 and 130 kDa, equal amounts of each) CC which have the same N-terminus, called respectively CsoS2A and CsoS2B CC (PubMed:10525740). Partial tryptic digestion and mass spectrometric CC analysis, as well as the presence of the shorter form even when a C- CC terminally tagged version is engineered, suggest CsoS2A is shorter at CC the C-terminus, but its sequence is not known (Probable). It has been CC shown these 2 forms are produced by ribosomal frameshifting CC (PubMed:26608811). {ECO:0000269|PubMed:10525740, CC ECO:0000269|PubMed:26608811, ECO:0000305|PubMed:25826651}. CC -!- MASS SPECTROMETRY: Mass=92.3; Method=MALDI; Note=Expressed in E.coli, CC CsoS2B (full-length) form.; Evidence={ECO:0000269|PubMed:26608811}; CC -!- MASS SPECTROMETRY: Mass=60.8; Method=MALDI; Note=Expressed in E.coli, CC CsoS2A (short) form.; Evidence={ECO:0000269|PubMed:26608811}; CC -!- DISRUPTION PHENOTYPE: Does not grow in air but does grow in 2% CO(2), CC called a high-CO(2) requiring phenotype, hcr. Cells do not make CC carboxysomes (PubMed:25826651). Also absolutely required for CC carboxysome formation in E.coli (PubMed:33116131). CC {ECO:0000269|PubMed:25826651, ECO:0000269|PubMed:33116131}. CC -!- BIOTECHNOLOGY: Expression of 10 genes for alpha-carboxysome (Cb) CC proteins (cbbL-cbbS-csoS2-csoS3-csoS4A-csoS4B-csoS1C-csoS1A-csoS1B- CC csoS1D) in E.coli generates compartments that resemble Cb, contain CC RuBisCO and have its catalytic activity, showing it is possible to make CC artificial, functional Cb using these 10 genes. Cargo proteins can be CC targeted to these organelles (PubMed:22184212, PubMed:30305640).
CC Artificial Cb assembly in E.coli requires csoS2-csoS4A-csoS4B-csoS1C- CC csoS1A-csoS1B-csoS1D (but not the gene for carbonic anhydrase, csoS3). CC Targeting proteins to the organelle requires at least one of the CsoS2 CC C-repeats; 3 repeats give the best localization. A nanoreactor of the CC Cb shell proteins has been engineered which generates H(2) using a CC ferredoxin-hydrogenase fusion (AC P07839-Q9FYU1) and a CC flavodoxin/ferredoxin--NADP reductase (AC A0A0K3QZA5) targeted CC separately to the Cb; the hydrogenase has first to be matured and CC activated by HydGXEF (AC Q8EAH9, Q8EAH8, Q8EAH7 and Q8EAH6 CC respectively). Encapsulation increases H(2) production about 20% during CC anaerobic growth, and over 4-fold more during aerobic growth CC (PubMed:33116131). {ECO:0000269|PubMed:22184212, CC ECO:0000269|PubMed:30305640, ECO:0000269|PubMed:33116131}. CC -!- SIMILARITY: Belongs to the CsoS2 family. {ECO:0000305}. CC -!- CAUTION: Both CsoS2A and CsoS2B were originally thought to be CC glycosylated (PubMed:10525740). Later experiments do not show any CC evidence of glycosylation (PubMed:26608811). CC {ECO:0000269|PubMed:10525740, ECO:0000269|PubMed:26608811}. CC -!- SEQUENCE CAUTION: CC Sequence=ACX95763.1; Type=Erroneous initiation; Note=Truncated N-terminus.; Evidence={ECO:0000305}; CC --------------------------------------------------------------------------- CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms CC Distributed under the Creative Commons Attribution (CC BY 4.0) License CC --------------------------------------------------------------------------- DR EMBL; AF038430; AAC32551.1; -; Genomic_DNA. DR EMBL; CP001801; ACX95763.1; ALT_INIT; Genomic_DNA. DR EnsemblBacteria; ACX95763; ACX95763; Hneap_0920. DR KEGG; hna:Hneap_0920; -. DR eggNOG; ENOG502Z8T4; Bacteria. DR HOGENOM; CLU_016451_0_0_6; -. DR OMA; DEPGTCK; -. DR Proteomes; UP000009102; Chromosome. DR InterPro; IPR020990; Carboxysome_shell. DR Pfam; PF12288; CsoS2_M; 3.
PE 1: Evidence at protein level; KW Carbon dioxide fixation; Carboxysome; Direct protein sequencing; KW Reference proteome; Repeat; Ribosomal frameshifting. FT CHAIN 1..869 FT /note="Carboxysome assembly protein CsoS2B" FT /id="PRO_0000452067" FT REPEAT 16..35 FT /note="N-repeat 1" FT /evidence="ECO:0000269|PubMed:25826651, FT ECO:0000269|PubMed:32123388" FT REPEAT 91..110 FT /note="N-repeat 2" FT /evidence="ECO:0000269|PubMed:25826651, FT ECO:0000269|PubMed:32123388" FT REPEAT 172..191 FT /note="N-repeat 3" FT /evidence="ECO:0000269|PubMed:25826651, FT ECO:0000269|PubMed:32123388" FT REPEAT 218..237 FT /note="N-repeat 4" FT /evidence="ECO:0000269|PubMed:25826651, FT ECO:0000269|PubMed:32123388" FT REPEAT 260..309 FT /note="M-repeat 1" FT /evidence="ECO:0000269|PubMed:25826651" FT REPEAT 319..368 FT /note="M-repeat 2" FT /evidence="ECO:0000269|PubMed:25826651" FT REPEAT 379..419 FT /note="M-repeat 3" FT /evidence="ECO:0000269|PubMed:25826651" FT REPEAT 435..484 FT /note="M-repeat 4" FT /evidence="ECO:0000269|PubMed:25826651" FT REPEAT 494..539 FT /note="M-repeat 5" FT /evidence="ECO:0000269|PubMed:25826651" FT REPEAT 545..594 FT /note="M-repeat 6" FT /evidence="ECO:0000269|PubMed:25826651" FT REPEAT 604..651 FT /note="C-repeat 1" FT /evidence="ECO:0000269|PubMed:26608811" FT REPEAT 690..730 FT /note="C-repeat 2" FT /evidence="ECO:0000269|PubMed:26608811" FT REPEAT 773..813 FT /note="C-repeat 3" FT /evidence="ECO:0000269|PubMed:26608811" FT REGION 1..259 FT /note="N-terminal domain" FT /evidence="ECO:0000305|PubMed:25826651, FT ECO:0000305|PubMed:32123388" FT REGION 260..603 FT /note="Middle region" FT /evidence="ECO:0000305|PubMed:25826651, FT ECO:0000305|PubMed:32123388" FT REGION 604..838 FT /note="C-terminal domain" FT /evidence="ECO:0000305|PubMed:25826651, FT ECO:0000305|PubMed:32123388" FT REGION 839..869 FT /note="C-terminal peptide (CTP)" FT /evidence="ECO:0000305|PubMed:26608811, FT ECO:0000305|PubMed:32123388" FT VAR_SEQ 568..571 FT /note="MSGD -> DVR (in 
isoform CsoS2A)" FT /id="VSP_060905" FT VAR_SEQ 572..869 FT /note="Missing (in isoform CsoS2A)" FT /id="VSP_060906" FT MUTAGEN 18..25 FT /note="KELARARR->AELARARA: Prevents RuBisCO binding; when FT associated with all 4 N-repeat mutations." FT /evidence="ECO:0000269|PubMed:32123388" FT MUTAGEN 93..100 FT /note="RDLCRQRR->ADLCRQRA: Prevents RuBisCO binding; when FT associated with all 4 N-repeat mutations." FT /evidence="ECO:0000269|PubMed:32123388" FT MUTAGEN 174..181 FT /note="RDICRARR->ADICRARA: Prevents RuBisCO binding; when FT associated with all 4 N-repeat mutations." FT /evidence="ECO:0000269|PubMed:32123388" FT MUTAGEN 220..227 FT /note="RDAAKRHR->ADAAKRHA: Prevents RuBisCO binding; when FT associated with all 4 N-repeat mutations." FT /evidence="ECO:0000269|PubMed:32123388" FT MUTAGEN 839..869 FT /note="Missing: Forms normal carboxysomes (in artificial FT E.coli carboxysomes)." FT /evidence="ECO:0000269|PubMed:33116131" FT CONFLICT 111..114 FT /note="AKTT -> VKTN (in Ref. 2; ACX95763)" FT /evidence="ECO:0000305" SQ SEQUENCE 869 AA; 91934 MW; A2C48415DDF7589C CRC64; MPSQSGMNPA DLSGLSGKEL ARARRAALSK QGKAAVSNKT ASVNRSTKQA ASSINTNQVR SSVNEVPTDY QMADQLCSTI DHADFGTESN RVRDLCRQRR EALSTIGKKA AKTTGKPSGR VRPQQSVVHN DAMIENAGDT NQSSSTSLNN ELSEICSIAD DMPERFGSQA KTVRDICRAR RQALSERGTR AVPPKPQSQG GPGRNGYQID GYLDTALHGR DAAKRHREML CQYGRGTAPS CKPTGRVKNS VQSGNAAPKK VETGHTLSGG SVTGTQVDRK SHVTGNEPGT CRAVTGTEYV GTEQFTSFCN TSPKPNATKV NVTTTARGRP VSGTEVSRTE KVTGNESGVC RNVTGTEYMS NEAHFSLCGT AAKPSQADKV MFGATARTHQ VVSGSDEFRP SSVTGNESGA KRTITGSQYA DEGLARLTIN GAPAKVARTH TFAGSDVTGT EIGRSTRVTG DESGSCRSIS GTEYLSNEQF QSFCDTKPQR SPFKVGQDRT NKGQSVTGNL VDRSELVTGN EPGSCSRVTG SQYGQSKICG GGVGKVRSMR TLRGTSVSGQ QLDHAPKMSG DERGGCMPVT GNEYYGREHF EPFCTSTPEP EAQSTEQSLT CEGQIISGTS VDASDLVTGN EIGEQQLISG DAYVGAQQTG CLPTSPRFNQ TGNVQSMGFK NTNQPEQNFA PGEVMPTDFS IQTPARSAQN RITGNDIAPS GRITGPGMLA TGLITGTPEF RHAARELVGS PQPMAMAMAN RNKAAQAPVV QPEVVATQEK PELVCAPRSD QMDRVSGEGK ERCHITGDDW SVNKHITGTA GQWASGRNPS
MRGNARVVET SAFANRNVPK PEKPGSKITG SSGNDTQGSL ITYSGGARG //