Stages of biomolecular condensate formation in pro-β-carboxysome assembly
Kun Zang, Xiaoyu Hong, Nghiem D. Nguyen, Loraine M. Rourke, Jiwon Lee, Benedict M. Long, G. Dean Price, Manajit Hayer-Hartl

TL;DR
The paper explains how a key protein, ApN, helps assemble carboxysomes in cyanobacteria, which could help in engineering these structures for use in plants.
Contribution
The study reveals the specific role of ApN in forming the pro-carboxysome, a crucial step in carboxysome assembly.
Findings
ApN forms a hetero-complex with CM at the pro-carboxysome periphery.
Shell formation occurs only after Rubisco and CA are assembled into the pro-carboxysome core.
This mechanism provides insight into β-carboxysome assembly for potential plant engineering.
Abstract
Cyanobacteria have evolved a CO2-concentrating mechanism (CCM) in the form of a microcompartment with a proteinaceous shell called carboxysome, harbouring the photosynthetic enzyme Rubisco and carbonic anhydrase (CA). β-Carboxysome assembly proceeds by an inside-out process, in which Rubisco, CA and the shell adaptor protein ApN (also known as CcmN) first form the pro-carboxysome biomolecular condensate mediated by the scaffolding protein CM (also known as CcmM). How ApN assembles into the pro-carboxysome as a prerequisite for shell formation has remained unclear. Here we show that ApN is recruited to the periphery of the pro-carboxysome as a hetero-complex of three ApN protomers and one CM protomer. The association of (ApN)3:CM at the rim of the pro-carboxysome ensures that shell formation and maturation of the carboxysome proceeds only after assembly of the two enzymes, Rubisco and…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 10
Figure 11
Figure 12
Figure 13
Figure 14
Figure 15
Figure 16
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsPhotosynthetic Processes and Mechanisms · Metalloenzymes and iron-sulfur proteins · Lipid metabolism and biosynthesis
Main
Bacterial microcompartments (BMCs) are specialized proteinaceous organelles that house specific metabolic pathways^1–3^. A well-studied and highly biologically relevant example is the carboxysome of all cyanobacteria and many chemoautotrophic bacteria^4–6^. Carboxysomes enclose the key photosynthetic enzyme ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) together with carbonic anhydrase (CA), creating a high-CO_2_ microenvironment to enhance the efficiency of carbon fixation by Rubisco^7–9^. Understanding carboxysome biogenesis is key to both unravelling the principles of cellular compartmentalization^6,10^ and engineering carboxysomes into chloroplasts to enhance plant carbon fixation^9,11–13^.
The Rubisco enzyme of α- and β-carboxysomes is a complex of ~550 kDa consisting of eight large (RbcL; ~55 kDa) and eight small (RbcS; ~13 kDa) subunits. However, the two carboxysome types differ in the sequence of their Rubisco RbcL subunits^14^: α-carboxysomes in proteobacteria and α-cyanobacteria contain form 1A, while β-carboxysomes found only in β-cyanobacteria contain prokaryotic form 1B Rubisco^15^. The prokaryotic form 1B Rubisco closely resembles the eukaryotic form of plants and green algae, making it a relevant model for chloroplast adaptation. The α- and β-carboxysomes also differ in their protein components and assembly mechanism^6,16–18^. The proteinaceous shell allows HCO_3_^−^ entry but retards the escape of CO_2_ generated by CA inside the oxidizing carboxysome^19–21^ (Fig. 1a). This concentrates CO_2_ in the vicinity of Rubisco (so-called CO_2_-concentrating mechanism, CCM), competitively inhibiting Rubisco oxygenation, which leads to energetically costly photorespiration^7,22,23^. The cyanobacterial CCM also entails a mechanism to elevate HCO_3_^−^ in the cytosol via active transporters in the cell membranes^24,25^. In addition, carboxysomal Rubisco enzymes have carboxylation properties, including high catalytic turnover rates, suited to improving carbon fixation in C_3_ crops equipped with a chloroplast CCM capable of supplying Rubisco with saturating CO_2_ (refs. ^9,26,27^).Fig. 1. Analysis of the shell adaptor protein, ApN.a, CO_2_-concentrating mechanism (CCM) of β-cyanobacteria. 3PGA, 3-phosphoglycerate; RuBP, ribulose-1,5-bisphosphate. b, Cartoon representation of the key carboxysome genes of β-cyanobacterium Se7942. The genes for pro-β-carboxysome assembly: ccmM, scaffolding protein CM; ccmN, shell adaptor protein ApN; rbcL/rbcS, Rubisco; ccaA, carbonic anhydrase CA. c, Domain structures of the scaffolding protein (CM), its truncated form (CM^Ct^) and the shell adaptor protein (ApN), with approximate molecular weights indicated. d, SEC–MALS analysis of ApN and SIIApN. The theoretical molar masses of tetrameric ApN and SIIApN are 65.31 and 70.85 kDa, respectively. The measured molar masses are indicated. Representative data of 2 independent experiments are shown (n = 2). e, Cryo-EM single-particle analysis of ApN tetramers. Shown are representative 2D class averages from Extended Data Fig. 1b. Scale bar, 50 Å. f, AlphaFold 3 (AF3) structural model in ribbon representation of (ApN)4 complex (residues 11–114) shown in end-on view with C termini on top (bottom view). The three walls of the triangular-shaped ApN protomer are labelled as A, B and C. g,h, The interaction surface between walls A and C in (ApN)4. Two adjacent ApN protomers are shown in ribbon and surface representations (g), and walls A and C, in surface representation, exposed to the viewer (h). The two conserved cysteines, C49 and C88, are indicated by dashed boxes. Residue colouring based on their hydrophobicity moment and charge. Hydrophobic, yellow; positive charge, blue; negative charge, red. i, Condensate formation analysed by fluorescence microscopy (FM). The experimental scheme, and protein concentrations and Alexa fluorophores (AF) used are indicated. Labelled proteins were used at a 1:10 ratio with unlabelled protein. Representative data of 2 independent experiments are shown (n = 2). DIC, differential interference contrast. Scale bar, 10 μm.Source data
The genes essential for the biogenesis of α- and β-carboxysomes are encoded in the cso and ccm operons, respectively^5,6,28,29^. The ccm operon of the β-cyanobacterium Synechococcus elongatus PCC7942 (Se7942) encodes the protein CcmM that functions as the central organizing scaffold during pro-β-carboxysome formation^30^ (Fig. 1b). CcmM (henceforth CM) in Se7942 forms a functional homo-trimer^30^ of ~58 kDa subunits. The CM protomer is composed of an N-terminal γ-carbonic anhydrase-like (γCAL) domain followed by a C-terminal (Ct) domain containing three Rubisco small subunit-like (SSUL) modules connected by flexible linkers^31^ (Fig. 1c). Trimerization of CM is mediated via the γCAL domains^20,30^, resulting in a (CM)3 complex with a total of nine SSUL modules. A smaller isoform of CM, generated from an internal ribosome binding site^31^, comprises only the three SSUL modules (CM^Ct^) (Fig. 1c). Both (CM)3 and CM^Ct^ are required for β-carboxysome biogenesis in vivo^32^. We recently reported that (CM)3 recruits Rubisco (RbcL_8_S_8_) and the tetrameric CA via multiple, interwoven co-assembly reactions of its SSUL modules with Rubisco and of the γCAL domains with the C-terminal helices of (CA)4 (ref. ^30^). These interactions result in the formation of an essentially immobile protein network within a phase-separated droplet, representing the pro-carboxysome core before formation of the proteinaceous shell. For maturation of β-carboxysomes, it is proposed that (CM)3 and the adaptor protein CcmN (henceforth ApN; ~16 kDa protomer) facilitate interactions with the shell proteins, initiating shell formation^33–35^. A recent X-ray structure showed that CM and ApN form a hetero-trimer, consisting of one CM and two ApN protomers, and suggested that this complex functions as the adaptor for recruitment of shell proteins and maturation of the β-carboxysome^36^. ApN appears not to be required for the initial stages of β-carboxysome biogenesis. Deletion of the ccmN gene in Se7942 allowed the assembly of pro-carboxysomes, but resulted in high CO_2_ requirement for growth^16,34^, consistent with defective shell assembly. How the hetero-complex of ApN and CM is recruited into the pro-carboxysome as a prerequisite for shell formation is not yet understood. Moreover, which shell proteins directly interact with ApN remains unclear^31,33–38^.
Reconstituting a carboxysome-based CCM in plants to improve photosynthetic efficiency remains a major challenge^9,11–13^, particularly for β-carboxysomes, where coordinated shell formation has proven elusive^37,39^. Although assembly of both simplified and more complete α-carboxysomes has been achieved in plants^40,41^, efforts to reconstitute β-carboxysomes have progressed more slowly. Expression of the Se7942 β-carboxysome Rubisco and the scaffolding protein CM in Nicotiana tabacum chloroplasts led to the formation of Rubisco-enriched aggregates^42^, while the transient expression of shell proteins in N. benthamiana produced discrete structures reminiscent of β-carboxysomes^37^.
Here we use a combination of biochemistry, structural analysis and in vivo mutational analysis to define the functional ApN/CM hetero-complex and the stage at which it acts as a prerequisite for proteinaceous shell assembly and carboxysome maturation. We find that only when co-translated, ApN and CM form an (ApN)3:CM hetero-tetramer. This complex assembles at the periphery of the initial Rubisco/CM^Ct^/(CM)3 condensate upon in vitro reconstitution, with CA either incorporated before or simultaneously with (ApN)3:CM at the rim of the pro-carboxysome. The (ApN)3:CM complex converts to an (ApN)2:CM hetero-trimer induced by oxidation of conserved cysteine residues of ApN. As the interior of the carboxysome acquires oxidizing properties, this structural conversion may be relevant in shell formation, consistent with in vivo complementation experiments showing that the cysteine residues are required for efficient carboxysome function and/or formation. Our findings suggest that successful reconstitution of β-carboxysomes in chloroplasts will probably require coordinated co-expression of the scaffolding protein CM and the shell adaptor protein ApN, as well as consideration of redox-regulated interactions, to ensure stoichiometrically accurate shell-to-lumen organization.
RESULTS
ApN forms a tetramer in solution
To investigate the role of ApN in β-carboxysome biogenesis, we recombinantly expressed Se7942 ApN in E. coli with an N-terminal Strep-tag II (SIIApN; 17.8 kDa protomer) and purified the protein in the presence of reducing agent (see Methods). Analysis of SIIApN by size-exclusion chromatography coupled to multiangle static light scattering (SEC–MALS) indicated a molar mass of ~69.3 kDa (Fig. 1d and Supplementary Table 1), suggesting that SIIApN is a tetramer (theoretical mass 70.85 kDa). We also obtained a tetrameric state for tag-free ApN (~63.5 kDa, theoretical mass 65.31 kDa) (Fig. 1d and Supplementary Table 1).
Next, to obtain insight into the structure of ApN, we performed cryo-electron microscopy (cryo-EM) and single-particle analysis on the untagged protein under reducing conditions, mimicking the environment of the cyanobacterial cytosol. The two-dimensional (2D) class averages of the small particles, obtained at the low resolution of ~6.0 Å, revealed the presence of a 4-fold symmetric complex (Fig. 1e and Extended Data Fig. 1a,b), consistent with ApN being a tetramer in solution (Fig. 1d). Note that due to a preferred particle orientation, the information for the Fourier space is missing, making subsequent 3D reconstruction impossible. To overcome this problem, we tested various strategies, such as coating grids with support films^43^, pretreating with polylysine^44^ or employing tilts during data collection^45^. However, none of these measures could overcome the challenges imposed by the low molecular weight (ApN)4 complex (~65 kDa). Nevertheless, the triangular shape of each protomer is consistent with the end-on view of ApN in a recently reported hetero-trimeric crystal structure, consisting of two SUMO-tagged ApN protomers (SUMO-ApN) and one protomer of the N-terminal γCAL domain (residues 1–209) of CM^36^. The tetramers observed in the cryo-EM 2D class averages are structurally similar to the tetrameric arrangement of the ApN protomers predicted by AlphaFold 3 (AF3)^46^ for residues 11–114 (Fig. 1e,f), with scores for predicted template modelling (pTM) and interface pTM (ipTM) of 0.58 and 0.55, respectively^47^. Structurally, the ApN protomer consists of an N-terminal domain (residues 1–118) that forms a 5-turn left-handed β-helical barrel^36^, similar to the N-terminal γCAL domain of CM (residues 1–181) which forms a 7-turn left-handed β-helical barrel^20,30^ (Extended Data Fig. 1c). The β-helix domain of ApN is followed by a flexible region of ~50 amino acids containing the so-called encapsulation peptide (EP) at the extreme C terminus (Fig. 1c). The EP sequence is predicted to form an amphipathic α-helix and is required for anchoring to shell proteins^16,33,34,37,38^. The low-confidence prediction, by AF3, of the flexible C-terminal domains in (ApN)4, containing the EP helices (Extended Data Fig. 1d), suggests conformational flexibility. This is consistent with the absence of observable density in both the 2D cryo-EM classes of (ApN)4 (Fig. 1e and Extended Data Fig. 1b) and the crystal structure of the (SUMO-ApN)2:CM(1–209) hetero-trimer^36^.
By aligning the cryo-EM 2D classes of (ApN)4 with the orientation of the AF3 structural model, the triangular shape of the β-helical ApN protomers observed in the cryo-EM end-on view indicates that the C terminus is located at the top and the N terminus at the bottom, resulting in an anti-clockwise positioning of wall A to wall C (Fig. 1e,f and Extended Data Fig. 1e). A hydrophobicity moment plot^48^ of this interface revealed that several surface-exposed hydrophobic residues from both A and C walls stabilize the interaction (Fig. 1g,h), while the surface-exposed residues of the four B walls surrounding the tetramer and the loops connecting to walls A and C contain several charged residues (Fig. 1g). In the (ApN)4 complex, wall A is always packed against wall C (Fig. 1f), which harbours two highly conserved cysteines, Cys49 and Cys88 (Fig. 1h).
In summary, ApN on its own forms a tetramer, raising the question of how it interacts with (CM)3 or other components of the pro-carboxysome before shell assembly.
Tetrameric ApN does not participate in pro-β-carboxysome formation
To analyse the possible interactions between (ApN)4 and other components of the pro-β-carboxysome, we performed pull-down assays of (SIIApN)4 after incubation with either (CM)3, Rubisco (RbcL_8_S_8_), (CA)4 or CM^Ct^ (residues 197–539). Only SIIApN was enriched in the pull-down fraction in each case as analysed by sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS–PAGE) (Extended Data Fig. 2a), indicating that (ApN)4 does not stably interact with the other proteins of the pro-β-carboxysome condensate^30^.
We next took advantage of a sedimentation assay to test whether (ApN)4 can be recruited into biomolecular condensates with Rubisco and (CA)4 formed by the scaffolding proteins (CM)3 and CM^Ct^. It was previously shown that when (CM)3, CM^Ct^, Rubisco and (CA)4 were mixed, all four proteins co-assembled into droplet-shaped condensates and sedimented in the pellet fraction^30^ (Extended Data Fig. 2b, lanes 1–3). The concentration of ApN in Se7942 is thought to be low relative to the other constituents^49,50^. We first performed the sedimentation assay using 0.25 μM (ApN)4, but no ApN could be detected in the pellet fraction (Extended Data Fig. 2b, lanes 4–6). Increasing the concentration of (ApN)4 by 10-fold also failed to show co-sedimentation with the other four proteins (Extended Data Fig. 2b, lanes 7–9). To exclude the possibility that the amount of ApN recruited into the condensate is too low for detection by SDS–PAGE, we performed fluorescence microscopy (FM) using Alexa Fluor (AF)-labelled proteins: Rubisco_AF568_, (CM)3AF488, CM^Ct^AF488, (CA)4AF647 and (ApN)4AF405. The respective labelled proteins used at a 1:10 ratio to unlabelled protein confirmed that (ApN)4 is unable to co-localize into phase-separated droplets containing Rubisco, (CM)3, CM^Ct^ and (CA)4 (Fig. 1i; see Methods).
These findings suggest that tetrameric ApN does not interact with the pro-β-carboxysome core proteins, raising the question of how the essential ApN^16,34^ is recruited to trigger shell formation and β-carboxysome maturation.
Formation of ApN/CM hetero-complex
Previous in vivo studies showed that ApN interacts with the N-terminal γCAL domain of CM^34,35^, consistent with the recently reported hetero-trimeric crystal structure^36^ of (SUMO-ApN)2:CM(1–209). The ccmN gene is generally positioned downstream of ccmM in the ccm operon of β-cyanobacteria^5,6,29^ (Fig. 1b), suggesting functional linkage. As shown above, the ApN protomers can assemble into a tetramer upon recombinant expression in E. coli (Fig. 1d,e and Supplementary Table 1), while the CM protomers form trimers^30^. To test whether subunits of these complexes can exchange to form ApN/CM hetero-complexes, we performed pull-down assays on a mixture of (SIIApN)4 and (CM)3. Upon incubation in the presence of dithiothreitol (DTT), mimicking cell cytosol, hetero-complex formation was not detected (Extended Data Fig. 2c, lanes 1–6). Since the interior of the mature carboxysome is oxidizing^19–21^, we also performed the reaction under oxidizing conditions in the presence of 2 mM hydrogen peroxide (H_2_O_2_), but again, no hetero-complexes were observed (Extended Data Fig. 2c, lanes 7–12).
The absence of detectable subunit exchange between (CM)3 and tetrameric (ApN)4 suggested that the hetero-complex forms in a manner coupled to protein synthesis^51^. To generate the hetero-complex, we used a bicistronic plasmid for recombinant expression in which 10 histidines (H_10_) were fused to the C terminus of full-length ccmM (CM-H10) or the truncated gene (residues 1–198) lacking the SSUL modules (CM ^Nt^-H10), and a Strep-tag II to the N terminus of ccmN (SIIApN) (Fig. 2a). Strong ribosome binding sites (RBS) were inserted upstream of both genes. This is in contrast to the in vivo operon of Se7942, in which the RBS upstream of ccmN is much weaker than the RBS of ccmM, resulting in high concentrations of (CM)3 and CM^Ct^ and relatively low expression of ApN^49,50,52,53^. To isolate the hetero-complexes, we employed a two-step protocol under reducing conditions with nickel-nitrilotriacetic acid (Ni-NTA) chromatography as the first step, followed by a Strep–Tactin column (Extended Data Fig. 3a).Fig. 2. Biochemical and structural analysis of the ApN/CM hetero-complex.a, Schematic of the bicistronic plasmids for recombinant expression of hetero-complexes. b, SEC–MALS analysis of the purified complexes. The expected molecular masses for complexes containing three ApN protomers and one CM protomer are 112.69 kDa for SIIApN/CM-H_10_, 75.83 kDa for SIIApN/CM^Nt^-H_10_ and 106.81 kDa for ApN/CM. The measured molar masses are indicated. Representative data of 2 independent experiments are shown (n = 2). c, End-on view of the AF3 structural model of (ApN)3:CM in ribbon representation (N termini on top). ApN protomers, magenta; CM protomer, blue. The N-terminal residues of the ApN protomers and both the N- and C-terminal residues for the CM protomer are indicated. A to C walls of the ApN protomers labelled as in Fig. 1f, and the CM protomer walls are labelled D, E and F. The weak and strong interfaces between the ApN protomers and CM are marked by dashed boxes. The conserved cysteine residues on ApN walls C are shown as sticks (gold). d,e, Analysis of the strong ‘tongue and groove’ interface of the loop residues between walls D and F of CM with residues of ApN1 wall A. Interacting residues are shown in stick representation and H-bonding as dashed lines (d). Surface representation of residues involved at the strong interface exposed to the viewer (e). The conserved residues, L76 and T94 on wall A of the ApN1 are indicated by a dashed box. Residue colouring based on their hydrophobicity moment and charge. Hydrophobic, yellow; positive charge, blue; negative charge, red. f, Analysis of L76 and T94 point mutants. Top: schematic of the bicistronic plasmids for recombinant expression of the hetero-complexes and the workflow for pull-down experiments. Bottom: SDS–PAGE analysis with lane numbering corresponding to the numbering in the workflow. Representative data of 2 independent experiments are shown (n = 2).Source data
SEC–MALS in the presence of DTT measured molar masses of ~111.6 and ~77.9 kDa for the hetero-complexes SIIApN/CM-H_10_ and SIIApN/CM^Nt^-H_10_, respectively (Fig. 2b and Supplementary Table 1), suggesting that they contain three SIIApN protomers and one protomer of CM-H_10_ or CM^Nt^-H_10_ (theoretical molar masses of 112.69 kDa and 75.83 kDa, respectively). We also obtained relatively pure non-tagged hetero-complexes (Fig. 2a). SEC–MALS analysis revealed a hetero-complex of ~103.8 kDa (theoretical mass 106.81 kDa) (Fig. 2a,b and Supplementary Table 1), consistent with the mass of three ApN protomers and one CM protomer. AF3 was able to model the hetero-tetramer (with pTM and ipTM of 0.71 and 0.69, respectively), suggesting that the (ApN)3:CM complex is structurally plausible (Fig. 2c and Extended Data Fig. 3b,c). This stoichiometry is distinct from that of the previously reported hetero-trimer (SUMO-ApN)2:CM(1–209)^36^. In the hetero-trimer, wall C of ApN2 and wall D of the CM protomer are solvent exposed^36^. In the AF3, hetero-tetramer model wall A of the additional ApN (ApN3) interacts with the exposed wall C of ApN2, consistent with the hydrophobic interaction seen in the (ApN)4 homo-tetramer (Fig. 1f,g), while wall C of ApN3 (containing the conserved Cys49 and Cys88) faces wall D of CM. Moreover, there is a distinct gap between the ApN3 wall C and wall D of the CM protomer, suggesting that this interface in the hetero-tetramer is held only by weak interactions (Fig. 2c). Thus, this interface may have a destabilizing effect on the interface between ApN3 wall A and ApN2 wall C, facilitating dissociation of ApN3 and formation of the (ApN)2:CM hetero-trimer^36^.
On the other hand, the strong interface formed by the loops between walls D and F of the CM protomer with wall A of the ApN1 protomer on its right (Fig. 2c) is an extensive ‘tongue and groove’ interaction, involving hydrophobic and polar interactions, including several hydrogen bonds^36^ (Fig. 2d,e). We suggest that this interface possibly has a stabilizing effect on the hydrophobic interface between ApN1 wall C and ApN2 wall A (Fig. 2c). Since this interface is also likely to be critical for the formation of the hetero-tetramer, we mutated the highly conserved ApN residues Leu76 or Thr94 individually to the charged residues Asp or Arg (SIIApN-L76D, SIIApN-L76R, SIIApN-T94R) to impair the hydrophobic interactions and induce steric hindrance. As expected, upon co-expression with CM^Nt^-H_10_, the ApN mutants failed to assemble into hetero-tetramers (Fig. 2f).
In summary, we find that the ApN/CM hetero-complex assembled during co-translation in the reducing environment of the cytosol is a tetramer, consisting of three protomers of ApN and one CM protomer. The AF3 model of (ApN)3:CM indicates that the CM protomer forms a ‘strong’ interface with one of the ApN protomers via extensive ‘tongue and groove’ interactions and a ‘weak’ interface with another ApN protomer. While the ‘strong’ interface is required for the formation of the hetero-complex, the ‘weak’ interface may destabilize the hetero-tetramer to a hetero-trimer.
Cryo-EM single-particle analysis of the ApN/CM hetero-complex
To structurally confirm the tetrameric state of the ApN/CM hetero-complexes, we performed cryo-EM and single-particle analysis on both the non-tagged (ApN)3:CM and tagged (SIIApN)3:CM-H_10_ hetero-complexes. The top twenty 2D class averages for both (ApN)3:CM and (SIIApN)3:CM-H_10_ revealed one major class of dimers of hetero-trimers in an S-shaped open conformation (Fig. 3a and Extended Data Fig. 4a,b), similar to the S-shaped dimer-of-trimers of (SUMO-ApN)2:CM(1–209) in the crystallographic asymmetric unit^36^ (Fig. 3b and Extended Data Fig. 4c). Note that while the 2D classes of non-tagged (ApN)3:CM also contain tetramers of ApN (Extended Data Fig. 4a), no hetero-tetramers of either (ApN)3:CM or (SIIApN)3:CM-H_10_ (Supplementary Table 1), as identified in solution, were observed. The triangular-shaped β-helical barrel of CM is easily distinguished from the β-helical barrel of ApN by the presence of an additional density of its C-terminal α-helix that packs along wall F of the CM protomer^30^ (Fig. 3a,b). The SUMO-ApN/CM(1–209) hetero-complexes used for crystallization were trimeric in solution^36^, and it is conceivable that the (ApN)2:CM hetero-trimer arises due to the dissociation of the labile ApN subunit from the tetrameric (ApN)3:CM complex as discussed above. Formation of the dimer-of-trimers may then occur at high local concentrations, during crystallization^36^ or cryo-EM sample preparation (this study), driven by hydrophobic interactions (Fig. 1h) between the juxtaposed ‘C’ walls of ApN protomers (Fig. 3b). In the S-shaped cryo-EM 2D classes, there are two circular densities seen in the centre of each of the hetero-trimers, most probably representing the structured N-terminal segment of ApN1 as seen in the crystal structure^36^ (Fig. 3a,b and Extended Data Fig. 4c). The N-terminal residues of ApN2 and ApN3 are disordered in the AF3 model of the hetero-tetramer (Extended Data Fig. 3c) and also in the model of ApN2 in the crystal structure of the hetero-trimer^36^. Deletion of 10 residues following Met at the N terminus of ApN (ApNΔN) resulted in the formation of (ApNΔN)2:CM-H_10_ hetero-trimers (Supplementary Table 1), indicating that the structured N terminus of ApN1 is not required for stabilizing the hetero-trimer but might have a function in forming the hetero-tetramer.Fig. 3. Cryo-EM single-particle analysis of the ApN/CM hetero-complex.a, Analysis of (ApN)3:CM hetero-tetramers by cryo-EM. Shown are representative 2D class averages (bottom view) from Extended Data Fig. 4a. The C-terminal α-helix of the CM protomer is indicated by blue ovals. Scale bar, 50 Å. b,c, End-on bottom view of the asymmetric unit of the (SUMO-ApN)2:CM(1–209) crystal structure (PDB: 7D6C) in ribbon representation. The CM protomers are shown in blue and the ApN protomers in magenta (b). The N and C termini are indicated when possible and the C-terminal α-helix of CM is highlighted as in a. The highly conserved Cys49 and Cys88 on wall C of ApN are shown as sticks in gold. A close-up view of the black dashed box highlights the positions of the cysteines on the juxtaposed C walls of ApN2 and ApN2′ (c). d, Dissociation of labile ApN protomer (ApN3 in Fig. 2c) from the tetrameric (ApN)3:CM complex upon oxidation of cysteines. SEC–MALS analysis of (SIIApN)3:CM^Nt^-H_10_ or (SIIApN-2A)3:CM^Nt^-H_10_ complexes in the presence of 5 mM DTT (reducing, left panel) or after treatment with 5 mM H_2_O_2_ on ice for 60 min (oxidizing, right panel) (see Methods). The theoretical molecular masses of (SIIApN)3:CM^Nt^-H_10_, (SIIApN)2:CM^Nt^-H_10_, (SIIApN-2A)3:CM^Nt^-H_10_ and (SIIApN-2A)2:CM^Nt^-H_10_ hetero-complexes are 75.83, 58.12, 75.64 and 58.00 kDa, respectively. The measured molar masses are indicated. Representative data of 2 independent experiments are shown (n = 2). e,f, Cryo-EM single-particle analysis of cysteine mutants of ApN/CM hetero-complexes. Shown are representative 2D class averages of (ApN-2A)4:(CM)2 dimer of hetero-trimers (e) and (ApN-2S)2:CM hetero-trimers (f), extracted from Extended Data Fig. 5d,e, respectively. The C-terminal α-helix of CM is highlighted in each panel as in a. Scale bars, 50 Å.Source data
These findings suggest that under the conditions of cryo-EM analysis, the (ApN)3:CM complex could undergo rearrangement to (ApN)2:CM through the release of the weakly associated ApN3 subunit (Fig. 2c), followed by assembly to the S-shaped dimer-of-trimers.
Cysteine oxidation converts the (ApN)3:CM tetramer to (ApN)2:CM trimer
We next wondered whether a mechanism exists to regulate the dissociation of the labile ApN3 protomer from (ApN)3:CM (Fig. 2c) to form hetero-trimers. Considering that initial ApN/CM complex formation occurs in the reducing cytosol, while the mature carboxysome provides an oxidizing environment^19–21^, a plausible trigger is the oxidation of the conserved cysteines, Cys49 and Cys88, of ApN. The thiol groups of these cysteines are surface exposed on wall C of ApN3 that is adjacent to wall D of CM at the weak ApN–CM interface (Fig. 2c). To test the effect of cysteine oxidation on the integrity of the hetero-tetramer, we exposed (SIIApN)3:CM^Nt^-H_10_ complexes to oxidizing conditions (5 mM H_2_O_2_, 60 min on ice) followed by SEC–MALS analysis. Compared to the reduced hetero-tetrameric complex of molar mass ~77.9 kDa, the oxidized complex displayed a molar mass of ~60.1 kDa (Fig. 3d and Supplementary Table 1), indicative of a hetero-trimer with two protomers of SIIApN and one CM^Nt^-H_10_ protomer (theoretical mass 58.12 kDa). This is similar to the previously reported hetero-trimeric complex that was generated in the absence of reducing agent^36^. We thus speculate that cysteine oxidation may be a switch to facilitate the dissociation of the labile ApN3 protomer (Fig. 2c). Consistent with the SEC–MALS analysis, the hetero-tetramer (SIIApN)3:CM-H_10_ analysed upon oxidation by cryo-EM revealed only hetero-trimeric 2D classes (Extended Data Fig. 4d), suggesting that cysteine oxidation also interferes with the hydrophobic interaction between the juxtaposed ‘C’ walls necessary to form the dimer-of-trimers (Fig. 3b).
To confirm that the conformational change from hetero-tetramer to hetero-trimer is due to cysteine oxidation, we mutated Cys49 and Cys88 to alanine (ApN-2A) or serine (ApN-2S). All ApN-2A/CM complexes (ApN-2A/CM, SIIApN-2A/CM-H_10_, SIIApN-2A/CM^Nt^-H_10_) were hetero-tetrameric by SEC–MALS (Extended Data Fig. 5a and Supplementary Table 1). Upon treatment with H_2_O_2_, the alanine mutant complex (SIIApN-2A)3:CM^Nt^-H_10_ remained a tetramer, in contrast to SIIApN/CM^Nt^-H_10_ (Fig. 3d and Supplementary Table 1). Surprisingly, the ApN-2S/CM complexes (ApN-2S/CM, SIIApN-2S/CM-H_10_, SIIApN-2S/CM^Nt^-H_10_) populated hetero-trimers (Extended Data Fig. 5b and Supplementary Table 1). Analysis by native PAGE showed that the SIIApN-2S/CM^Nt^-H_10_ complex migrated faster than SIIApN/CM^Nt^-H_10_ and SIIApN-2A/CM^Nt^-H_10_ (Extended Data Fig. 5c), further indicating that the Cys to Ser mutation induced a transition from hetero-tetramer to hetero-trimer. Although both Ala and Ser are generally used to replace a redox-regulated Cys residue, they are distinct in their physical properties. Ala is smaller than Cys and contains a non-polar side chain (methyl group) that maintains the hydrophobicity of reduced Cys, while Ser is similar in size to Cys, containing a polar hydroxyl (OH) group which is more hydrophilic. Similarly, Cys oxidation would also increase hydrophilicity. This is consistent with the observation that Cys oxidation results in the conformational switch from hetero-tetramer to hetero-trimer, similar to the Cys to Ser mutation.
To obtain more insight into the structure of the Cys mutant complexes, we performed cryo-EM analysis of the tetrameric (ApN-2A)3:CM and trimeric (ApN-2S)2:CM. The (ApN-2A)3:CM complexes were again observed to rearrange and form the S-shaped dimer-of-trimers (Fig. 3e and Extended Data Fig. 5d), similar to the wild-type (ApN)3:CM complexes (Fig. 3a and Extended Data Fig. 4a), indicating that dimer-of-trimers formation is independent of Cys oxidation. This is consistent with the absence of disulfide bonds in the crystal structure^36^ (Fig. 3b,c). As expected, the ApN-2S/CM hetero-complexes revealed 2D classes of mainly hetero-trimers along with some tetramer-like structures which could not be unambiguously assigned (Fig. 3f and Extended Data Fig. 5e). The absence of S-shaped complexes (Extended Data Fig. 5e) suggests that the Cys to Ser mutation also interferes with the hydrophobic interactions between the juxtaposed ‘C’ walls necessary to form the dimer-of-trimers (Fig. 3b), similar to oxidation of the cysteines (Extended Data Fig. 4d).
In summary, the (ApN)3:CM hetero-tetramer is destabilized upon oxidation of the conserved cysteines of ApN, resulting in the release of one ApN protomer and formation of the (ApN)2:CM hetero-trimer. Probably, only the exposed cysteines of ApN3 are oxidized, while the cysteines at the wall A–C interface of ApN protomers are buried and cannot be oxidized. Thus, the conversion from hetero-tetramer to hetero-trimer and possibly to the dimer-of-trimers occurs during or after shell assembly when the interior of the carboxysome is oxidized.
In vivo complementation
Next, to test whether oxidation of the cysteines or hetero-complex formation is important in vivo, we generated a series of modified ApN forms for expression in a carboxysomeless mutant of Se7942. We employed the ΔccmK2LMNO mutant (hereafter ΔK2-O) in which these five genes of the ccm operon have been homologously replaced by the kanamycin-resistance marker, Kan^R^ (Extended Data Fig. 6a,b). ΔK2-O is carboxysomeless and high-CO_2_ requiring, and its growth in air can be recovered by re-introduction of ccmK2-O with the native promoter on plasmid pSE4ccmK2-O^54,55^ (Fig. 4a and Extended Data Fig. 6b,c). Complementation of ΔK2-O with plasmid pSE4ccmK2-O showed a moderately slower growth rate in air with doubling time of ~15 h, compared with ~7.5 h for wild type (Fig. 4b). Ultrastructure analysis of cyanobacterial cells revealed slightly larger carboxysomes in ΔK2-O + pSE4ccmK2-O than in wild type (Fig. 4c,d and Extended Data Fig. 6d). The cysteine mutants ApN-2A and ApN-2S (Extended Data Fig. 6b) were capable of growth at atmospheric CO_2_, as monitored by plate assay (Fig. 4a), but displayed significantly reduced growth rates in liquid culture (~27 h doubling time) (Fig. 4b), and accordingly have slightly larger carboxysomes than ΔK2-O + pSE4ccmK2-O cells (Fig. 4c,d). Our findings therefore suggest that the conserved cysteines in ApN have a role in vivo to ensure timely carboxysome maturation. We also note that mutant cell growth on plates did not always align with growth rates observed in liquid media (Fig. 4a,b). While we do not understand the full implications behind this phenomenon, there are perhaps differential rates of functional carboxysome formation in these two growth conditions.Fig. 4. In vivo assessment of the role of ApN in carboxysome assembly and cyanobacterial cell growth in air.a, Spot plate assay of ΔK2-O cyanobacterial cells and ΔK2-O complemented with wild-type (WT) K2-O or with the ApN-2A, ApN-2S, ApN-L76R or ApN-T94R mutants, grown in air at 30 °C for 21 days. Schematics of the ΔK2-O genome and shuttle vector pSE4ccmK2-O (K2-O) operons, and a representative plate from 3 replicates are shown (n = 3). b, Cell growth analysis in air at 30 °C (see Methods) of WT Se7942 cells, ΔK2-O cells complemented with WT K2-O or with the ApN-2A or ApN-2S was performed for 8 h, with the maximum growth rate determined from the slope and converted to doubling time. Data are mean ± s.d. of triplicate measurements (n = 3). *P < 0.05 (one-way ANOVA). c, Representative transmission electron micrographs of WT Se7942 cells, ΔK2-O cells and ΔK2-O complemented with either K2-O or K2-O/ApN mutants (ApN-2A, ApN-2S, ApN-L76R or ApN-T94R) grown in modified BG-11 medium at 2% CO_2_ at 30 °C for 7 days. Images are representative of carboxysomes (indicated by white arrowheads) in WT, ΔK2-O + K2-O, ApN-2A and ApN-2S cell lines. In ApN-L76R and ApN-T94R mutant lines, polar bodies (indicated by red arrowheads) are observed, while no internal ultrastructure is observed in the ΔK2-O knockout line. Scale bars, 500 nm. d, Analysis of width at the widest point of carboxysomes or polar bodies from images in c of WT Se7942 cells (n = 57), and ΔK2-O complemented with wild-type K2-O (n = 53) or with the ApN-2A (n = 58), ApN-2S (n = 56), ApN-L76R (n = 60) or ApN-T94R (n = 65) mutant proteins. The median (solid line), interquartile range (dotted lines, 25%–75%) and the distribution of the data are shown. **P < 0.01, ****P < 0.0001 (Tukey’s test).Source data
Next, to assess the role of conserved residues on wall A of ApN in carboxysome formation and function in vivo, we generated mutants ApN-L76R and ApN-T94R in the pSE4ccmK2-O plasmid (Extended Data Fig. 6b) and expressed these in ΔK2-O cells. Note that the respective ApN mutant proteins were defective in ApN/CM hetero-complex formation in vitro (Fig. 2f). Both ApN-L76R and ApN-T94R failed to support cell growth in ambient CO_2_ (Fig. 4a), indicating that they were unable to rescue the high-CO_2_-requiring phenotype of ΔK2-O cells. Ultrastructural analysis of the cells showed no formation of carboxysomes, and instead, indistinct polar bodies^55^ were detected with significantly larger dimensions than wild type or ΔK2-O + pSE4ccmK2-O carboxysomes (Fig. 4c,d and Extended Data Fig. 6d).
In summary, although the conserved cysteine residues C49 and C88 when mutated allowed carboxysome shell formation in vivo, they nevertheless play a role in the efficiency of the CO_2_-concentrating mechanism of the carboxysome. On the other hand, the residues of ApN wall A, stabilizing an extensive ‘tongue and groove’ interaction in the strong interface with CM in the hetero-complex (Fig. 2c), are critical for shell protein recruitment and carboxysome maturation.
(ApN)3:CM co-assembles with pro-carboxysome core proteins in vitro
To understand how the (ApN)3:CM hetero-complex is recruited into the pro-β-carboxysome biomolecular condensate, we first analysed its interaction with the other known constituents of the pro-carboxysome core under reducing conditions. The trimeric CM was shown to be involved in mediating multiple, dynamic protein–protein interactions^30,56^ (Fig. 5a), including (1) the interaction of SSUL modules with the γCAL domain of adjacent CM protomers (homo-demixing), (2) the interaction of SSUL modules with Rubisco and (3) the interaction of the C termini of (CA)4 with the γCAL domains of CM, resulting in condensate formation of each protein pair and all four proteins in combination^30,56^. (ApN)3:CM also underwent homo-demixing as monitored by turbidity assay, although higher concentrations of (ApN)3:CM were required for aggregate formation, consistent with a lower avidity due to the presence of only 3 SSUL modules compared with 9 SSUL modules in (CM)3 (Fig. 5b). FM analysis confirmed the formation of phase-separated droplets of (ApN)3:CM or (CM)3 (Fig. 5c,d). When mixed with (CM)3, droplet formation occurred at a lower concentration (Fig. 5e), where no homo-demixing was observed for the individual proteins. Turbidity assays also confirmed aggregate formation of the two proteins, while no absorbance signal was detected between (CM)3 and (ApN)4 (Extended Data Fig. 7a). In the absence of SSUL modules in (ApN)3:CM^Nt^, neither homo-demixing nor formation of droplets was detected (Fig. 5b,f). This is consistent with previous findings that the interaction of SSUL modules with γCAL domains of adjacent CM subunits is required for homo-demixing^30^.Fig. 5. Condensate formation of the (ApN)3:CM hetero-complex.a, Interactions of (CM)3. (i) (CM)3 homo-condensate formation via interaction of SSUL modules (gold) with γCAL domains (blue) of adjacent (CM)3 complexes. (ii) Interaction of the SSUL modules of (CM)3 with Rubisco (green). (iii) Interaction of the C terminus of (CA)4 (orange) with γCAL domains of (CM)3, simultaneously with the binding of SSUL modules^30^. Right: condensate of (CM)3/CM^Ct^ with Rubisco and (CA)4 via multiple protein–protein interactions. b, Concentration dependence of (ApN)3:CM (6–8 μM) homo-condensate formation analysed by turbidity assay in buffer containing 50 mM KCl and 5 mM DTT. (CM)3 (5–7 μM) was also analysed. Data are mean ± s.d. of triplicate measurements (n = 3). c–f, (ApN)3:CM and (CM)3 co-localize into droplet-shaped condensates. Analysis of homo-condensate formation of (ApN)3:CM at 6 and 8 μM (c), and (CM)3 at 3 and 6 μM (d). Co-condensate formation of (ApN)3:CM (4 μM) and (CM)3 (3 μM) (e). Shown as a control is the (ApN)3:CM^Nt^ complex (8 μM) (f). (ApN)3:CM and (ApN)3:CM^Nt^ complexes were labelled with AF405, while (CM)3 was labelled with AF488. Labelled proteins were used at a 1:10 ratio with unlabelled protein. Representative data from 2 independent experiments are shown (n = 2). Scale bars, 10 μm. g, Apparent binding affinity \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${(K}_{{\rm{D}}}^{\mathrm{app}})$$\end{document} of (ApN)3:CM to Rubisco. Turbidity was measured using 0.25 μM Rubisco and 0–2.5 μM (ApN)3:CM, 0–0.3 μM (CM)3 or 0–2.5 μM CM^Ct^ in buffer containing 100 mM KCl and 5 mM DTT. Absorbance values reached after 10 min are plotted. Data are mean ± s.d. of triplicate measurements (n = 3). h, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${K}_{{\rm{D}}}^{\mathrm{app}}$$\end{document} of (ApN)3:CM or (CM)3 to (CA)4. Turbidity was measured as in g using 0.5 μM (CA)4 and 0–4 μM (ApN)3:CM or 0–0.45 μM (CM)3. Inset: expanded view of the titration of (CM)3. Data are mean ± s.d. of triplicate measurements (n = 3). a.u., arbitrary units.Source data
The SSUL modules of (CM)3 also interact with Rubisco by binding in a groove between RbcL dimers^30,56^. As expected, (ApN)3:CM has a ~6-fold lower apparent binding affinity for Rubisco ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${K}_{{\rm{D}}}^{\mathrm{app}}$$\end{document} ≈ 0.4 μM) than (CM)3 ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${K}_{{\rm{D}}}^{\mathrm{app}}$$\end{document} ≈ 0.07 μM) (Fig. 5g), and both proteins co-localized into phase-separated droplets (Extended Data Fig. 7b). The affinity of (ApN)3:CM for Rubisco is similar to that of CM^Ct^ ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${K}_{{\rm{D}}}^{\mathrm{app}}$$\end{document} ≈ 0.4 μM) (Fig. 5g), which also consists of only three SSUL modules. In the absence of either SSUL modules in (ApN)3:CM^Nt^ or in (ApN)4, no interaction with Rubisco was observed (Extended Data Fig. 7c).
In the interaction between (CM)3 and (CA)4 (Fig. 5a), the conserved C-terminal sequence of CA binds in a pocket of the γCAL domain of CM^30^, resulting in an apparent affinity of ~0.2 μM for (CM)3 binding to (CA)4 (Fig. 5h). An ~7-fold lower apparent affinity was observed for the interaction between (ApN)3:CM and (CA)4 ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${K}_{{\rm{D}}}^{\mathrm{app}}$$\end{document} ≈ 1.4 μM), consistent with the presence of only one γCAL domain in the hetero-complex (Fig. 5h). FM analysis confirmed that both proteins co-assembled into phase-separated droplets (Extended Data Fig. 7d). No absorbance signal was recorded with (ApN)4 or when the conserved C-terminal peptide in CA was deleted (CAΔC)4 (Extended Data Fig. 7e). Moreover, no turbidity was observed between (CA)4 and (ApN)3:CM^Nt^ (Extended Data Fig. 7e), consistent with previous findings that condensate formation involves not only the binding of the C-terminal sequences of (CA)4 to the γCAL domain of CM, but also the interaction of the SSUL modules with the γCAL domain of adjacent CM protomers^30^.
To analyse and confirm the subunit stoichiometry of (CA)4 and (ApN)3:CM complexes, we performed isothermal titration calorimetry (ITC). To avoid condensate formation during ITC measurements, we used the construct (ApN)3:CM^Nt^. Analysis in the presence of the reducing agent TCEP (1 mM) revealed a binding affinity ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${K}_{{\rm{D}}}$$\end{document} ) of ~2.25 μM and a stoichiometry of 1:4 for (CA)4 to (ApN)3:CM^Nt^ (Extended Data Fig. 7f), supporting the previous finding that the Se7942 CA enzyme is tetrameric^30^. We also analysed the interaction between (ApN)3:CM^Nt^ and the monomeric construct E_GFP_CA_C17_ (ref. ^30^), in which the last 17 residues of CA (CA_C17_) are attached to the C terminus of enhanced green fluorescent protein (E_GFP_) via a flexible linker. The E_GFP_CA_C17_ bound to (ApN)3:CM^Nt^ with a much higher \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${K}_{{\rm{D}}}$$\end{document} of ~24 μM and at a stoichiometry of 1:1 (Extended Data Fig. 7g), confirming the presence of only one CM protomer in the hetero-complex.
Based on these findings, the reduced (ApN)3:CM can homo-demix and also interact with the individual pro-carboxysome core proteins, (CM)3, Rubisco and (CA)4, only via the CM protomer. In contrast to (CM)3, the (ApN)3:CM hetero-complex containing only one CM protomer has a lower affinity for Rubisco or (CA)4.
(ApN)3:CM incorporates into the periphery of the pro-β-carboxysome condensate
We recently showed that the four proteins (CM)3, CM^Ct^, Rubisco and (CA)4 co-localized into biomolecular condensates in vitro, with (CM)3 being the central organizer^30^ (see also Fig. 1i). Here, to better understand β-carboxysome biogenesis, we analysed the step of pro-carboxysome assembly at which the (ApN)3:CM hetero-complex is included, given that the C-terminal EP of ApN is involved in the interaction with shell proteins^16,33,34,37,38^. The stoichiometry of the essential core components within a functional β-carboxysome remains unclear^31,49,50,57,53^. However, the consensus is that Rubisco is the main component with a concentration of ~1 mM (~600 molecules per carboxysome), followed by the scaffolding proteins CM^Ct^ and (CM)3, while the carbonic anhydrase enzyme (CA)4 and the shell adaptor protein ApN are of low abundance^50^. We thus speculated that there must be a stepwise assembly pathway to ensure that ApN is positioned at the periphery of the pro-carboxysome for binding to shell proteins^16,33,34,37,38^.
To test this, we first mixed the three main components, Rubisco, CM^Ct^ and (CM)3, on the assumption that they mediate the initial step in pro-β-carboxysome biogenesis^16,31,57^, and confirmed that all three proteins co-assembled into droplet-shaped condensates by FM (Fig. 6a). Fluorescence recovery after photobleaching (FRAP) of the droplets showed that Rubisco and (CM)3 were essentially immobile in the condensate, while CM^Ct^ displayed some mobility (Extended Data Fig. 8a–c). Note that at the low concentrations used, no condensate formation was observed upon mixing CM^Ct^ and (CM)3 in the absence of Rubisco (Extended Data Fig. 8d). If the initial condensate is first formed by mixing Rubisco and CM^Ct^, subsequent addition of (CM)3 resulted in a distinct ring of (CM)3 around the homogeneous Rubisco/CM^Ct^ condensate (Extended Data Fig. 8e), indicating that entry of (CM)3 was restricted. Note that the Rubisco in the Rubisco/CM^Ct^ condensate is immobile, while CM^Ct^ is mobile^30^. In contrast, the condensate containing Rubisco/CM^Ct^/(CM)3 allowed the small molecule CM^Ct^, but not (CM)3, to freely diffuse (Extended Data Fig. 8f), consistent with gel-like properties of the condensate.Fig. 6. Sequential formation of pro-β-carboxysome condensates.a–h, Condensate formation analysed by FM under reducing conditions and 100 mM KCl. The experimental scheme, protein concentrations and fluorophores used are indicated for each respective co-assembly experiment. Labelled proteins were used at 1:10 ratio with unlabelled protein. Representative data from 2 or 3 independent experiments are shown (n = 2 or 3). Insets: magnified condensates. Scale bars, 10 μm, 5 μm (insets). The distribution of (ApN)3:CM in the condensate was obtained by measuring the fluorescence signal of AF405 across 4 random droplets. Right: the magenta intensity profile graphs represent the signal distribution across the magenta line of the indicated droplet (left) (for Fig. 6c,d, see Extended Data Fig. 9d,e, respectively).Source data
The next component added was (CA)4, which interconverts HCO_3_^−^ and CO_2_ within the oxidizing environment of the carboxysome^19–21^. The (CA)4 localized to the rim of the droplet, independent of whether (CM)3 was evenly distributed in the initial condensate or at the periphery (Fig. 6b and Extended Data Fig. 9a), suggesting that there were enough (CM)3 molecules exposed at the outer rim for (CA)4 to bind via the interaction of the C-terminal C2 peptide of (CA)4 with the γCAL domains of (CM)3. If the scaffolding proteins, CM^Ct^ and (CM)3, were added to a mixture of Rubisco and (CA)4, the (CA)4 was evenly distributed in the condensate^30^ (Extended Data Fig. 9b).
Next, to explore the incorporation of the (ApN)3:CM hetero-complex, we first added (ApN)3:CM to the homogeneous condensate containing Rubisco/CM^Ct^/(CM)3 but no (CA)4. Similar to (CA)4 (Fig. 6b and Extended Data Fig. 9a), (ApN)3:CM incorporated into the rim of the Rubisco/CM^Ct^/(CM)3 condensate regardless of whether (CM)3 was evenly distributed or enriched at the periphery (Fig. 6c,d and Extended Data Fig. 9c). Analysis of the (ApN)3:CM fluorescence profile across the droplets showed higher fluorescence intensity at the condensate rim compared with the core (Fig. 6c,d and Extended Data Fig. 9c–e). Next, we added (ApN)3:CM to the condensate with (CA)4 already localized at the periphery. As expected, the (ApN)3:CM co-localized and accumulated at the rim with (CA)4, and its distribution was again independent of (CM)3 localization (Fig. 6e,f). However, when (CA)4 was homogeneously distributed, the (ApN)3:CM hetero-complex was only weakly concentrated at the droplet rim relative to the core (Extended Data Fig. 9f). This observation contrasts with the behaviour of (ApN)3:CM in the absence of (CA)4, localizing efficiently at the rim of the condensate (Fig. 6c,d and Extended Data Fig. 9c–e), suggesting that (CA)4 in the core recruits (ApN)3:CM, thereby diminishing its accumulation at the periphery. If (CA)4 was added after (ApN)3:CM, it efficiently localized to the rim while (ApN)3:CM was displaced to the core (Fig. 6g). This effect was not observed when (CA)4 and (ApN)3:CM were added simultaneously (Fig. 6h). Note that the localization of proteins at the rim remained stable for at least 30 min in the various phase-separated droplets.
In summary, for (ApN)3:CM to be efficiently localized to the condensate rim, (CA)4 incorporation has to occur after formation of the initial Rubisco/CM^Ct^/(CM)3 condensate, with (ApN)3:CM being added either simultaneously with or after (CA)4. Thus, the in vitro reconstitution experiments reveal a plausible pathway for sequential pro-β-carboxysome assembly that places the shell adaptor hetero-complex (ApN)3:CM at the periphery of the condensate. This position would then allow the C-terminal encapsulation peptide of ApN to bind to various shell proteins and trigger the final step in the assembly of the mature β-carboxysome (Fig. 7).Fig. 7. Stages in pro-β-carboxysome condensate formation and carboxysome maturation.This model outlines a sequential assembly pathway that could be mimicked in chloroplasts to guide the formation of functional β-carboxysomes, with the (ApN)3:CM complex serving as a key determinant of shell proteins recruitment and spatial organization. In stage 1, the highly abundant Rubisco protein is sequestered via multivalent binding of the SSUL modules of the scaffolding proteins (CM)3 and CM^Ct^, forming the initial condensate. In stage 2, the low-abundance tetrameric CA enzyme and the (ApN)3:CM hetero-tetramer are recruited to the rim of the stage 1 condensate, via multiple low-affinity interactions, involving binding of C2 peptides of (CA)4 to the γCAL domains of (CM)3 (i) and to the γCAL domain of (ApN)3:CM hetero-complex (ii), and binding of SSUL modules of (CM)3 to the γCAL domain of the CM protomer in the hetero-complex (iii) and SSUL modules of the CM protomer in (ApN)3:CM with Rubisco (iv). The unresolved final stage of β-carboxysome maturation involves binding of the C-terminus EP domain of the ApN protomers in the hetero-complex (ApN)3:CM to various shell proteins and assembly of the icosahedral shell.
Discussion
Carboxysomes are critical for cyanobacterial growth and represent a viable biotechnological innovation for plant photosynthetic enhancement^9,58^. Here we propose a mechanism for the stepwise assembly of the pro-β-carboxysome, a process of biomolecular condensate formation that precedes shell formation. The (ApN)3:CM hetero-complex is functionally critical and must localize at the pro-carboxysome rim for shell assembly. In the first stage of pro-β-carboxysome assembly, the highly abundant Rubisco protein is sequestered via multivalent binding of the SSUL modules of the scaffolding proteins (CM)3 and CM^Ct^, forming the initial condensate (Fig. 7, stage 1). In the second stage, the low-abundance carbonic anhydrase (CA)4 and (ApN)3:CM complexes are recruited to the condensate rim (Fig. 7, stage 2), provided that (CA)4 binds before or simultaneously with (ApN)3:CM (Fig. 6e–h and Extended Data Fig. 9f). Stage 2 involves four types of interaction: (ⅰ,ⅱ) binding of the C-terminal sequence (C2) of (CA)4 to the γCAL domains of (CM)3 and to the γCAL domain of CM in (ApN)3:CM, (ⅲ) binding of SSUL modules of (CM)3 to the γCAL domain of CM in (ApN)3:CM, and (ⅳ) binding of SSUL modules of CM in (ApN)3:CM to Rubisco^30^ (Fig. 7, stage 2). The (ApN)3:CM complex, located at the periphery of the condensate, then exposes the encapsulation peptide of ApN for interaction with various shell proteins (Fig. 7, stage 3). While the mechanism of shell assembly remains to be investigated, the peripheral localization of (ApN)3:CM has a key role in this process. In the context of engineering β-carboxysomes into chloroplasts, ApN_3_:CM thus represents a modular adaptor that must be co-expressed with Rubisco and CA.
When expressed independently in E. coli, ApN forms a tetramer in solution, incapable of interacting with the other pro-carboxysome core proteins (Extended Data Fig. 2). Given the position of the ccmN gene directly downstream of ccmM within the ccm operon, our co-expression studies revealed that the N-terminal 5-turn β-helical barrel of ApN interacts with the 7-turn barrel of the γCAL domain of CM, forming hetero-complexes (Fig. 2a–c). The hetero-complex formed under reducing conditions is predominantly a tetramer consisting of three ApN protomers and one CM protomer, (ApN)3:CM. Hydrophobic interactions appear to be key in driving hetero-tetramer formation through the interface between ApN wall A and the loops between walls D and F of CM, an extensive ‘tongue and groove’ interaction (Fig. 2c–e). Mutations impairing hydrophobic interactions at this interface prevented hetero-complex formation (Fig. 2f) resulting in the failure of shell assembly and a high CO_2_ requirement for cell growth in vivo (Fig. 4a,c,d and Extended Data Fig. 6c,d). This indicated that ApN must form the hetero-complex with CM to be incorporated into the pro-carboxysome. These findings are consistent with previous results showing that deletion of the ccmN gene arrests carboxysome biogenesis at the pro-carboxysome stage^34^.
Cryo-EM analysis of (ApN)3:CM revealed dimers of hetero-trimers in an S-shaped open conformation (Fig. 3a), suggesting that hetero-tetramers are converted to hetero-trimers through the release of the ApN subunit that weakly interacts with CM (Fig. 2c). Such a rearrangement may occur during carboxysome maturation, which is known to coincide with a change from reducing to oxidizing conditions^19–21^. Indeed, we found that oxidation of the conserved cysteines Cys49 and Cys88 in ApN regulated the dissociation of the labile ApN protomer from the tetrameric (ApN)3:CM to form hetero-trimers (Fig. 3d). Mutating cysteines to alanine (ApN-2A) or serine (ApN-2S) resulted in various hetero-oligomers. ApN-2A formed a hetero-tetramer with CM in solution (Supplementary Table 1) and the S-shaped dimer-of-trimers by cryo-EM (Fig. 3e and Extended Data Fig. 5d), similar to wild-type ApN (Fig. 3a and Extended Data. Fig. 4a), while ApN-2S formed hetero-trimers in solution (Supplementary Table 1) and in cryo-EM, with no S-shaped complexes observed (Fig. 3f and Extended Data Fig. 5e). Thus, the Cys to Ser mutation appears to interfere with hydrophobic contacts necessary for forming the hetero-tetramer in solution and the dimer-of-trimers in cryo-EM analysis. Importantly, in the Cys to Ala/Ser mutants, all ApN subunits are mutated, while in the wild-type complex in vivo, probably only the exposed cysteines of the labile ApN subunit are susceptible to oxidation. The importance of redox regulation of the cysteines is less pronounced, as cells with either the Ala and Ser mutations still formed carboxysomes, albeit somewhat larger in size and at a 2-fold slower rate than in wild type (Fig. 4b–d). Thus, these findings highlight the role of the conserved cysteines in redox regulation, facilitating the conformational switch from hetero-tetramer to hetero-trimer.
The hetero-tetrameric (ApN)3:CM complex described here, which forms under reducing conditions, is distinct from the previously reported hetero-trimer complex, (SUMO-ApN)2:CM(1–209), which was obtained under non-reducing conditions^36^. We note that despite differences in their hetero-oligomeric state, both complexes may function as shell protein adaptors at distinct stages of assembly of the β-carboxysome. Moreover, the dimer-of-trimers seen upon cryo-EM analysis of (ApN)3:CM and crystallization of (SUMO-ApN)2:CM(1–209)^36^ possibly represents the end-state of this conversion process, although attempts to observe it in solution have not been successful. The dimer-of-trimers could provide increased avidity of the EP sequence of ApN for recruiting shell proteins and thereby enhance carboxysome maturation.
The scaffolding proteins (CM)3 and CM^Ct^ together with Rubisco and (CA)4 co-localize into biomolecular condensates, with (CM)3 acting as the central organizer^30^. We found the assembly and spatial organization of pro-carboxysomes to be dependent on the order of incorporation of core proteins. Rubisco, CM^Ct^ and (CM)3 form the initial condensate, followed by the addition of (CA)4 and (ApN)3:CM, both of which localize to the edge of the condensate (Fig. 7). Importantly, (ApN)3:CM must bind simultaneously with or after (CA)4 to be efficiently positioned at the condensate rim, and localizing the C-terminal encapsulation peptide of ApN at the periphery may then trigger shell assembly for β-carboxysome maturation. While in ~30% of cyanobacterial species, such as Thermosynechococcus elongatus BP-1 (ref. ^20^), the γCAL N-terminal domain of CM is a functional CA, this is not the case in Se7942 or Synechocystis PCC6803 (Syn6803)^59,60^, which have separate genes for the CA enzyme. CA-deficient Se7942 (refs. ^61,62^) or Syn6803 (ref. ^59^) contain carboxysomes similar in morphology and abundance to wild type, implying that CA was not required for incorporation of (ApN)3:CM and subsequent shell formation, although high CO_2_ was required for growth due to the absence of CA^59,61,62^. In our in vitro reconstitution of the pro-carboxysome of Se7942, we find that the recruitment of the (ApN)3:CM is most efficient when CA is at the rim of the initial condensate (Fig. 6e–h and Extended Data Fig. 9f). Although the subcompartment localization of CA in β-carboxysomes is under debate^35,57,63^, enriching CA at the periphery of the pro-carboxysome will allow more efficient conversion of HCO_3_^−^ to CO_2_ as it enters through the carboxysome shell. Such a localization of CA was suggested upon structural analysis of α-carboxysomes^64,65^. Moreover, (CM)3 is probably enriched at the periphery of the pro-carboxysome as well, based on evidence that the γCAL domain of CM is involved in shell protein interaction^35,57^. Indeed, we find that in the absence of CA, recruitment of (ApN)3:CM by (CM)3 is as efficient as when CA is present at the rim of the condensate (Fig. 6c–f and Extended Data Fig. 9c–e), probably due to the high concentration of SSUL modules in (CM)3 that facilitate binding to the γCAL domain of the CM protomer. When the γCAL domain of CM is not a functional CA, as in Se7942, pro-carboxysome assembly must be regulated to ensure that the CA enzyme is incorporated into the pro-carboxysome before shell formation, which is triggered by the ApN/CM hetero-complex. Future research will be required to reveal the oligomeric state of the ApN/CM hetero-complex in the functional carboxysome.
Towards reconstruction of the β-carboxysome in plants, our results highlight that co-expression of ApN and CM, forming a hetero-complex, is required to enable the formation of appropriate intermediate structures mediating shell formation. Dead-end homo-tetramer formation by ApN will probably lead to defective carboxysome formation, suggesting that fine-tuning of its expression in foreign hosts must be considered in achieving functional carboxysomes. The identification of cysteine residues in ApN that upon oxidation result in the conversion of the hetero-tetramer (ApN)3:CM to a hetero-trimer further underscores the importance of redox-sensitive regulation in carboxysome assembly, consistent with the known redox requirements for CA functionality^20^. Furthermore, our findings suggest that the coordinated assembly observed in vitro (Fig. 6 and Extended Data Fig. 9) will need to be at least partially recapitulated in the plant chloroplast. This would be more reliably achieved by co-expression of both genes from within the chloroplast genome than upon expression of nuclear genes and protein import into chloroplast with cleavable transit peptides. Recapitulating this assembly sequence in chloroplasts may overcome current bottlenecks in β-carboxysome engineering and realize functional CO_2_-concentrating microcompartments that enhance photosynthetic efficiency in C_3_ crops.
Methods
Strains
E. coli DH5α (ThermoFisher) cells were used for amplification of plasmid DNA. Positive clones were selected and cultivated in LB medium at 37 °C for 8 h. E. coli BL21 (DE3) (Agilent) was used for recombinant protein expression (see Method details).
The cyanobacterium S. elongatus PCC7942 (Se7942) (Institut Pasteur Paris) was used to obtain genomic DNA of ccmN. Se7942 was cultured in BG-11 medium at 30 °C and 50 rpm in continuous light.
Plasmids
Oligos used for amplification and generation of plasmids are listed in Supplementary Tables 2 and 3.
Genomic DNA
The genomic DNA of Se7942 was obtained as previously described^30^.
Plasmids
pHUE-SeApN was generated by amplification of the full-length ccmN from genomic DNA of Se7942 and subsequent cloning into pHUE (His6-ubiquitin at the N terminus) via Gibson assembly^66^. pET28a-StrepII-SeApN was generated by amplification of the full-length SeccmN from pHUE-SeApN with Strep-tag II (GSWSHPQFEKGSG) inserted after the initial methionine, and subsequent cloning into pET28a.
SeccmM was amplified from pHUE-SeM58 (ref. ^30^) and inserted into pET28a with or without C-terminal fused 10His-tag (GSGGSHHHHHHHHHH) to generate pET28a-SeCM-His10 or pET28a-SeCM, respectively. StrepII-SeApN or His6ubi-SeApN with the preceding RBS from pET28a-StrepII-SeApN or pHUE-SeApN was then inserted C terminally into pET28a-SeCM-His10 or pET28a-SeCM to obtain bicistronic pET28a-SeCM-His10-StrepII-SeApN or pET28a-SeCM-His6ubi-SeApN, respectively, via PCR and Gibson assembly. The full-length CM fragment in both bicistronic vectors was replaced by its first 198 residues (1–198) to generate pET28a-SeCM^Nt^(1–198)-His10-StrepII-SeApN and pET28a-SeCM^Nt^(1–198)-His6ubi-SeApN.
StrepII-ApN fragment in pET28a-SeCM-His10-StrepII-SeApN was replaced by the full-length ApN or ApN lacking the first 10 residues (2–11) after the initial methionine, via PCR and Gibson assembly, to generate pET28a-SeCM-His10-SeApN or pET28a-SeCM-His10-SeApN(ΔN10), respectively.
Point mutations in SeApN were introduced by Gibson assembly to generate the following plasmids: pET28a-SeCM-His6ubi-SeApN-C49A/C88A, pET28a-SeCM-His6ubi-SeApN-C49S/C88S, pET28a-SeCM-His10-StrepII-SeApN-C49A/C88A, pET28a-SeCM-His10-StrepII-SeApN-C49S/C88S, pET28a-SeCM^Nt^(1–198)-His10-StrepII-SeApN-C49A/C88A, pET28a-SeCM^Nt^(1–198)-His10-StrepII-SeApN-C49S/C88S, pET28a-SeCM^Nt^(1–198)-His10-StrepII-SeApN-L76D, pET28a-SeCM^Nt^(1–198)-His10-StrepII-SeApN-L76R and pET28a-SeCM^Nt^(1–198)-His10-StrepII-SeApN-T94R.
pUC18-ΔKO was generated through PCR amplification of the native upstream (A) and downstream (B) flanking regions (~1,000 bp) of Se7942 ccmKLMNO genes in the ccm operon, including EcoRI and BamHI, or BamHI and HindIII sites 5’ to 3’ in flanks A and B, respectively (Extended Data Fig. 6a,b). These fragments were A-tailed and ligated into pGem-T. Flank A was cloned into pUC18-Amp^R^ to generate pUC18-FlankA, and flank B was cloned into pUC18-FlankA to generate pUC18-FlankAB. The Kan^R^ cassette from pUC4K was isolated using BamHI and cloned directly into pUC18-FlankAB to generate pUC18-ΔKO.
pSE4-ccmKLMNO was generated from the synthesized ccmKLMNO fragment with its native promoter and flanking NcoI and XbaI sites into the corresponding sites of pSE4, replicating the coding sequence of the Se7942 ccm operon. A single additional NheI site (in ccmM gene) was eliminated during synthesis to enable assembly.
Protein expression and purification
SeRubisco^67,68^, SeCM^Ct^ (ref. ^56^), SeCM^30^ and SeCA proteins^30^ were expressed and purified as previously described.
ApN
ApN was expressed and purified from E. coli BL21 (DE3) cells harbouring pHUE-SeApN. Briefly, cells were grown in LB medium at 37 °C with shaking at 130 rpm until OD_600_ 0.4–0.5. Cells were equilibrated to 18 °C (~1 h) and protein expression induced by addition of 0.2 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) for 14 h at 120 rpm. Cells were collected and incubated in buffer A (50 mM Tris-HCl pH 8.0, 300 mM NaCl, 20 mM β-ME, 5% glycerol) containing 20 mM imidazole, 1 g l^−1^ lysozyme, 2.5 U ml^−1^ SmDNAse and complete protease inhibitor cocktail (Roche) for 1 h before lysis using EmulsiFlex C5 (Avestin). After centrifugation (40,000 × g, 30 min, 4 °C), the supernatant was loaded onto a HisTrap HP column (Cytiva), equilibrated with 10 column volumes (CV) buffer A and 20 mM imidazole. Bound protein was eluted with a 20 CV gradient (20–500 mM imidazole). The H_6_Ub moiety was cleaved using H_6_-Usp2 overnight at 4 °C^66^. Cleaved protein was buffer exchanged on a HiPrep 26/10 desalting column (GE) to buffer A and then applied to the HisTrap HP column to trap H_6_-Usp2, the cleaved H_6_Ub moiety and uncleaved protein. The flowthrough was concentrated to ~5 ml and applied to a HiLoad 26/600 Superdex 200 column (GE) equilibrated in buffer B (50 mM Tris-HCl pH 8.0, 300 mM KCl, 5 mM DTT, 10% glycerol). Protein-containing fractions were concentrated and applied to the size-exclusion column (HiLoad 26/600 Superdex 75, GE) equilibrated in buffer B. Finally, the pure protein was concentrated by ultrafiltration using Vivaspin MWCO 10000 (GE), aliquoted and flash frozen in liquid N_2_.
SIIApN
SIIApN was expressed in E. coli BL21 (DE3) cells harbouring pET28a-StrepII-SeApN as described for ApN. Clarified cell lysates were loaded onto a gravity-flow Strep–Tactin XT column (IBA), equilibrated and washed with 5 CV buffer A. The bound protein was eluted with buffer A/50 mM biotin and applied to two size-exclusion chromatography columns (HiLoad 26/600 Superdex 200 and HiLoad 26/600 Superdex 75, GE) in tandem equilibrated in buffer B. Fractions containing SIIApN were concentrated, aliquoted and flash frozen in liquid N_2_.
SIIApN/CM-H10, SIIApN/CMNt-H10 and mutants
SIIApN/CM-H_10_ was expressed in E. coli BL21 (DE3) cells harbouring pET28a-SeCM-His10-StrepII-SeApN as described for ApN. Clarified cell lysates were loaded onto a gravity-flow nickel-nitrilotriacetic acid (Ni-NTA) metal affinity column (QIAGEN), equilibrated with buffer A/50 mM imidazole and washed with 5 CV of buffer A/50 mM imidazole. Unbound proteins in the flowthrough and washes were loaded onto the gravity-flow Strep–Tactin XT column to obtain the StrepII-tagged proteins (SIIApN). The Ni-NTA column bound protein was eluted with buffer A/500 mM imidazole and applied to a gravity-flow Strep–Tactin XT column equilibrated with buffer A. The flowthrough and 5 CV washes of buffer A contained only His10-tagged proteins (CM-H_10_/CM^Ct^-H_10_). SIIApN/CM-H_10_ complex bound to the Strep–Tactin XT column was eluted with buffer A/50 mM biotin and applied onto the HiLoad 26/600 Superdex 200 column (GE) equilibrated in buffer B. His10-tagged proteins (CM-H_10_/CM^Ct^-H_10_) from the flowthrough and wash were further purified via size-exclusion chromatography in high-salt buffer (50 mM Tris-HCl pH 8.0, 500 mM KCl, 5 mM DTT, 10% glycerol). Purified proteins were concentrated and frozen as described for ApN.
SIIApN/CM^Nt^-H_10_ was expressed in E. coli BL21 (DE3) cells harbouring pET28a-SeCM^Nt^(1–198)-His10-StrepII-SeApN and purified as described for SIIApN/CM-H_10_, except that only the His10-tagged proteins (CM^Nt^-H_10_) were purified by size-exclusion chromatography (HiLoad 26/600 Superdex 200, GE) in buffer (50 mM Tris-HCl pH 8.0, 150 mM KCl, 10% glycerol).
SIIApN/CM-H_10_ mutants and SIIApN/CM^Nt^-H_10_ mutants were expressed in E. coli BL21 (DE3) cells harbouring the respective plasmids (Supplementary Table 3) and purified as described for SIIApN/CM-H_10_ and SIIApN/CM^Nt^-H_10_, respectively.
ApN/CM-H10 and ApNΔN/CM-H10
ApN/CM-H_10_ and ApNΔN/CM-H_10_ were expressed in E. coli BL21 (DE3) cells harbouring pET28a-SeCM-His10-SeApN and pET28a-SeCM-His10-SeApN(ΔN10), respectively, as described for ApN. Clarified cell lysates were loaded onto a gravity-flow Ni-NTA metal affinity column (QIAGEN), equilibrated with buffer A/50 mM imidazole and washed with 5 CV of buffer A. Protein was eluted with buffer A/500 mM imidazole. Fractions enriched in ApN/CM-H_10_ or ApNΔN/CM-H_10_ were pooled. After size-exclusion chromatography (HiLoad 26/600 Superdex 200, GE) and equilibration in buffer B, purified proteins were concentrated and frozen as described for ApN.
ApN/CM, ApN/CMNt and mutants
ApN/CM and ApN/CM^Nt^ were expressed in E. coli BL21 (DE3) cells harbouring pET28a-SeCM-His6ubi-SeApN and pET28a-SeCM^Nt^-His6ubi-SeApN, respectively, and purified essentially as described for ApN. Only H_6_Ub-ApN/CM complex or H_6_Ub-ApN/CM^Nt^ complex enriched fractions in the eluate of the HisTrap HP column (Cytiva) were pooled. The H_6_Ub moiety was cleaved and all proteins with H_6_ were removed as described for ApN. After size-exclusion chromatography (HiLoad 26/600 Superdex 200, GE) and equilibration in buffer B, purified proteins were concentrated and frozen as described for ApN.
ApN/CM mutants were expressed in E. coli BL21 (DE3) cells harbouring the respective plasmids (Supplementary Table 3) and purified as described for ApN/CM.
Protein extinction coefficient
Protein concentrations were determined spectrophotometrically at 280 nm on the basis of their respective extinction coefficients. The extinction coefficient of a protein was determined using ProtParam on the ExPASy online server (https://web.expasy.org/protparam/)^69^. For homo-oligomeric proteins (SeCM, SeCA, SeApN and SeSIIApN), calculation included one copy of the respective protein sequence. For SeRubisco, one copy of the sequence of each of the RbcL and RbcS subunits was used. For hetero-complexes, for example (ApN)3:CM, the calculation included three copies of ApN and one copy of CM sequence.
Strep–Tactin pull-down assay
(SIIApN)4 (1.5 μM) was incubated with (CM)3 (1 μM), Rubisco (0.25 μM), (CA)4 (1.5 μM) or CM^Ct^ (3 μM) in buffer C (50 mM Tris-HCl pH 8.0, 150 mM KCl, 5 mM DTT) at 25 °C for 30 min. The mixture was loaded onto a pre-equilibrated gravity-flow Strep–Tactin XT column (IBA), washed with buffer C and eluted with buffer C/50 mM biotin. Fractions were analysed using SDS–PAGE and Coomassie staining.
Pulldowns of a mixture of (SIIApN)4 and (CM)3 under reducing or oxidizing conditions at 25 °C or 37 °C were performed as follows: Under reducing conditions, the pulldowns were carried out as above. Under oxidizing conditions, (SIIApN)4 and (CM)3 were first oxidized separately with 2 mM H_2_O_2_ on ice for 30 min. Next, the oxidized sample of (SIIApN)4 was loaded onto a PD MiniTrap G-10 column (GE) pre-equilibrated with buffer (50 mM Tris-HCl pH 8.0, 300 mM KCl), and the oxidized (CM)3 in buffer (50 mM Tris-HCl pH 8.0, 500 mM KCl) to remove unreacted H_2_O_2._ The two oxidized proteins were then mixed and incubated in buffer C (minus DTT) for 30 min. Finally, the reaction was loaded onto a pre-equilibrated gravity-flow Strep–Tactin XT column (IBA), washed with buffer C (minus DTT) and eluted with buffer C/50 mM biotin (minus DTT). Fractions were analysed using SDS–PAGE and Coomassie staining.
Sedimentation assay
Rubisco (0.5 μM), CM^Ct^ (2 μM), (CM)3 (0.25 μM), (CA)4 (0.125 μM) and (ApN)4 (0.25 or 2.5 μM) were incubated in buffer D (50 mM Tris-HCl pH 8.0, 100 mM KCl, 10 mM Mg(OAc)2, 5 mM DTT) at 25 °C for 10 min. Samples were immediately fractionated into total (T), supernatant (S) and pellet (P) by centrifugation (20,817 × g, 20 min, 25 °C), and analysed using SDS–PAGE and Coomassie staining.
Turbidity assay
Measurements were performed at 25 °C in buffer (50 mM Tris-HCl pH 8.0, 10 mM Mg(OAc)2) containing different concentrations of KCl and in the presence of 5 mM DTT as indicated in figures. Reactions (100 μl) containing proteins as stated in figures were rapidly mixed by vortexing, and absorbance at 340 nm was monitored over time on a Jasco V-560 spectrophotometer set to 25 °C. Data were plotted using OriginPro 2020.
Condensate formation analysed by FM
Proteins analysed by FM were fluorescently labelled at the N terminus. Rubisco holoenzyme was labelled with Alexa Fluor 568 NHS ester (ThermoFisher) (Rubisco_AF568_) according to manufacturer instructions (~6.5 dye molecules bound per holoenzyme). CM^Ct^, (CM)3 and (CA)4 proteins were reduced by adding 5 mM DTT to purified proteins before use. (ApN)4, (ApN)3:CM, (ApN)3:CM^Nt^ and CM^Ct^ were labelled with the fluorophore Alexa Fluor 405 NHS ester (ThermoFisher) (~1.3, ~1.6, ~1.2 and ~0.7 dye molecules bound per (ApN)4, (ApN)3:CM, (ApN)3:CM^Nt^ and CM^Ct^, respectively). (CM)3 and CM^Ct^ were labelled with the fluorophore Alexa Fluor 488 NHS ester (ThermoFisher) (~3.7 and ~0.9 dye molecules bound per (CM)3 and CM^Ct^, respectively), while (CA)4 and CM^Ct^ were labelled with the fluorophore Alexa Fluor 647 NHS ester (ThermoFisher) (~2.7 and ~1.8 dye molecules bound per (CA)4 and CM^Ct^, respectively). Labelled protein was mixed with unlabelled protein at a ratio of 1:10. Reactions (20 μl) were performed in buffer (50 mM Tris pH 8.0, 10 mM Mg(OAc)2, 5 mM DTT) containing 50 mM or 100 mM KCl and protein concentrations as stated in figures. Proteins were combined for times as indicated in the figures at 25 °C before transfer to an uncoated chambered coverslip (μ-Slide angiogenesis, Ibidi) and incubated for 5 min before analysis. Samples were illuminated with a 405 nm diode laser, 488 nm argon laser and white light lasers (NKT superK EXTREME) (at 570 nm and 647 nm) for fluorescence imaging. Images were recorded by focusing on the bottom of the plate using a Leica TCS SP8 AOBS confocal laser scanning microscope (HCX PL APO ×63/1.2 water objective, PMT detector).
Fiji software^70^ was used to generate condensate images. Image contrast was adjusted to optimize visibility of the condensates, and a Gaussian blur filter (sigma = 2 pixels) was applied to reduce pixel noise and smoothen images.
FRAP
FRAP experiments were carried out with a Leica TCS SP8 AOBS confocal laser scanning microscope (HCX PL APO ×63/1.2 water objective, PMT detector). Labelled protein (see above) was mixed with unlabelled protein at a ratio of 1:10. CM^Ct^AF405 (2 μM) and (CM)3AF488 (0.25 μM) were mixed with 1 μM unlabelled Rubisco holoenzyme (20 μl reactions) in buffer D for FRAP analysis of CM^Ct^ and (CM)3, while FRAP analysis of Rubisco was performed by mixing 1 μM Rubisco_AF405_ (~3.5 dye molecules per Rubisco holoenzyme) with 2 μM unlabelled CM^Ct^ and 0.25 μM unlabelled (CM)3. After 5 min incubation at 25 °C, reactions were transferred to an uncoated chambered coverslip (μ-Slide angiogenesis, Ibidi) followed by another 15 min at 25 °C before analysis. Images before and 5 min after photobleaching were recorded in a single focal plane at a 5 s time interval. Bleaching was performed with a bleach point model using a 405 nm diode laser at 3% intensity or a 488 nm argon laser at 100% intensity in three repeats with a dwell time of 100 ms. Fiji software was used for image analysis^70^.
SEC–MALS
Purified proteins (2 mg ml^−1^) were analysed using static and dynamic light scattering by auto-injection of the sample onto a SEC column (Superdex 200 Increase 10/300 GL, GE) at a flow rate of 0.35 ml min^−1^. Proteins were analysed at r.t. in buffer (50 mM Tris pH 8.0, 150 mM KCl, 5 mM DTT), except for CM and CM-H_10_ which were analysed in buffer containing 500 mM KCl (Supplementary Table 1). The column was in line with detectors: variable UV absorbance set at 280 nm (Agilent 1100 series), DAWN EOS MALS (Wyatt Technology, 690 nm laser) and Optilab rEX refractive index (Wyatt Technology, 690 nm laser)^71^. Molecular masses were calculated using ASTRA 5 (Wyatt Technology) with the dn/dc value set to 0.185 ml g^−1^. Bovine serum albumin (ThermoFisher) was used as the calibration standard.
For analysis of oxidized protein samples, DTT was first removed using a PD MiniTrap G-10 column (GE) equilibrated in buffer (50 mM Tris-HCl pH 8.0, 300 mM KCl). Then, the protein (2 mg ml^−1^) was incubated with 5 mM H_2_O_2_ on ice for 60 min before SEC–MALS in buffer (50 mM Tris-HCl pH 8.0, 150 mM KCl).
The plots shown in Figs. 1d, 2b and 3d, and in Extended Data Fig. 5a,b were generated using SigmaPlot 14.
ITC
ITC measurements were carried out on an ITC200 calorimeter (Microcal, GE) at 20 °C. After dialysis into buffer (50 mM Tris pH 8.0, 100 mM KCl, 10 mM Mg(OAc)2, 1 mM TCEP), (CA)4 (264.6 μM) or E_GFP_CA_C17_ (407.85 μM) was loaded into the syringe and the sample cell filled with 300 μl of (ApN)3:CM^Nt^ (35.3 μM) and the reference cell with buffer. For each measurement point, 10 μl of (CA)4 or E_GFP_CA_C17_ was injected at time intervals of 3 min. Data were analysed using the OriginPro 2020 software and fitted with a one-site binding model.
Cryo-EM
Sample preparation and data collection
Holey carbon-supported copper or gold grids (Quantifoil R1.2/1.3 300 mesh) were plasma cleaned for 30 s (Harrick Plasma) before use, except for the copper grids (Quantifoil R2/1 300 mesh) used for oxidized SIIApN/CM-H_10_. Samples were diluted to working concentrations immediately before application to the glow-discharged grids. A sample volume of 3.5 μl was applied to each grid at 25 °C and 90% humidity, then semi-automatically blotted and plunge frozen into liquid ethane using a Vitrobot Mark 4 system (ThermoFisher).
(i) (ApN)4
Purified (ApN)4 was diluted to 6.2 μM in buffer (50 mM Tris-HCl pH 8.0, 300 mM KCl, 5 mM DTT) and cryo-grids were prepared as described above. Grids were screened on a Glacios transmission electron microscope (ThermoFisher) equipped with a K2 summit direct electron detector (Gatan), operated at 200 keV. The selected grid on stage was used for data collection directly with the K2 summit. A total of 628 videos were automatically collected by SerialEM (10.1016/j.jsb.2005.07.007) using a pixel size of 1.181 Å. The total exposure time of 18.4 s was divided into 40 frames with an accumulated dose of 63 electrons Å^−2^ and a defocus range of –1.0 to –2.8 μm.
(ii) SIIApN/CM-H10
Purified (SIIApN)/CM-H_10_ protein complex was diluted to 4 μM in buffer (50 mM Tris-HCl pH 8.0, 100 mM KCl, 5 mM DTT). Cryo-grids were prepared and a total of 1,221 videos were collected as described in (i).
(iii) Oxidized SIIApN/CM-H10
Purified (SIIApN)/CM-H_10_ was pre-incubated with 2 mM H_2_O_2_ on ice for 30 min, then diluted to 4 μM in buffer (50 mM Tris-HCl pH 8.0, 100 mM KCl, 10 mM Mg(OAc)2) and incubated for 10 min at 25 °C. Cryo-grids were prepared as above and screened on a Talos Arctica transmission electron microscope (ThermoFisher) equipped with a K3 direct electron detector (Gatan). Data collection was performed using SerialEM, with 2,881 videos collected at a pixel size of 1.171 Å. The total exposure time of 6.4 s was divided into 40 frames with an accumulated dose of 63 electrons Å^−2^ and a defocus range of –1.0 to –2.8 μm.
(iv) ApN/CM, ApN-2A/CM or ApN-2S/CM
Purified ApN/CM, ApN-2A/CM or ApN-2S/CM complexes were diluted to 4 μM in buffer (50 mM Tris-HCl pH 8.0, 100 mM KCl, 5 mM DTT). Cryo-grids were prepared and a total of 1,953, 1956 and 2,129 videos were collected, respectively, as described in (iii).
Image processing
All datasets were processed with cryoSPARC v.4.2 (10.1038/nmeth.4169). The raw videos were imported into cryoSPARC for all subsequent processing steps (including patch motion correction, patch contrast transfer function estimation, blob picking, template picking, 2D classification). Details on particle number and resolution are presented in Extended Data Figs. 1b, 4a,b,d and 5d,e. Cryo-EM analysis was validated by converting the asymmetric unit of the crystal structure (PDB:7D6C)^36^ to a 10 Å resolution map using molmap (UCSF Chimera tool), followed by the generation of orientation-spanning reference projections in cryoSPARC. These projections are consistent with our experimental data and closely resembled the 2D classes of the dimer-of-hetero-trimers shown in Fig. 3a,e and Extended Data Figs. 4a,b and 5d.
Cyanobacterial transformation
Se7942 transformations were performed by collecting a 7-day-old lawn of cells grown on modified BG-11 plates (1% agar)^72^. Cells were washed and resuspended in modified BG-11 to an optical density (OD_730_) of ~5.0. Approximately 1 µg plasmid DNA was added to 200 µl aliquots of cells and incubated at 22 °C in the dark for 24 h. Cells were then spread onto modified BG-11 plates supplemented with appropriate antibiotics (Kanamycin, 25 µg ml^−1^; Spectinomycin, 10 µg ml^−1^) and 2 mM sodium thiosulfate, and grown at 30 °C in 3% CO_2_ (v/v in air) until single colonies appeared. For the generation of deletion cell lines, single colonies were transferred to fresh modified BG-11 plates and subcultured three times before genetic segregation was confirmed by diagnostic PCR.
Cell growth analysis
Se7942 cells and ApN mutants were first grown over 2–3 days in ~3% (v/v) CO_2_ and diluted in fresh modified BG-11 medium to an optical density (OD_730_) of 0.1 in triplicate 35 ml cultures. Cultures were grown at 30 °C, 80 µmol photons m^−2^ s^−1^ and bubbled with air. Hourly culture measurements (OD_730_) were conducted over a period of 8 h, and the maximum growth rate was determined from the slope of logarithmic regressions of the data and transformed into doubling times. Data were plotted in SigmaPlot 14.
Cell ultrastructure analysis
Se7942 cells were grown in modified BG-11 medium containing 20 mM HEPES-KOH pH 8.0, at 3% CO_2_, 30 °C and ~80 μmol photons m^−2^ s^−1^ over 2–3 days. Cells in 10 ml culture were fixed for at least 4 h with 4% formaldehyde/2.5% glutaraldehyde (Electron Microscopy Sciences) in PBS (pH 7.4). Cells were then centrifuged at 5,000 × g for 5 min, washed three times with PBS, and then post fixed in 1% (w/v) osmium tetroxide for 4 h. Fixed cells were dehydrated through a graded ethanol series and embedded in LR white resin (ProSciTech). Ultrathin sections were obtained using an ultramicrotome (Leica EM UC7), stained with 2% (w/v) uranyl acetate followed by lead citrate, and viewed using a JEOL JEM F200 transmission electron microscope at 80 kV. Images taken with a Gatan Rio 16 camera were analysed using DigitalMicrograph 3.5. ImageJ 1.54 m was used for carboxysome dimension measurements. Microsoft Excel for Microsoft 365 MSO (v.2505, Build 16.0.18827.20102, 64-bit) was used for data collation. Data were plotted in OriginPro 2020.
AlphaFold and structure analysis
AlphaFold 3 (AF3) (https://alphafoldserver.com/) was used to model the (ApN)4 homo-tetramer (four full-length ApN protomers, residues 1–161) and (ApN)3:CM^Nt^ hetero-tetramer (three ApN protomers plus one CM^Nt^, residues 1–198). Structural figures were generated in PyMol (http://www.pymol.org/).
Statistics
Statistical analysis of carboxysome sizes and cell growth rates were carried out with OriginPro 2020 or Graphpad Prism v.10 using one-way analysis of variance (ANOVA) with multiple comparisons. All relevant biochemical experiments were replicated two or three times. No statistical methods were used to predetermine sample size, but our sample sizes are similar to those reported in previous publications^56^.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Supplementary information
Supplementary InformationSupplementary Tables 1–3. Reporting Summary
Source data
Source Data Fig. 1. Statistical source data. Source Data Fig. 2. Statistical source data. Source Data Fig. 3. Statistical source data. Source Data Fig. 4. Statistical source data. Source Data Fig. 5. Statistical source data. Source Data Fig. 6. Statistical source data. Source Data Extended Data Fig. 5. Statistical source data. Source Data Extended Data Fig. 6. Statistical source data. Source Data Extended Data Fig. 7. Statistical source data. Source Data Extended Data Fig. 8. Statistical source data. Source Data Extended Data Fig. 9. Statistical source data.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1de Pins, B. et al. Rubisco is slow across the tree of life. Proc. Natl Acad. Sci. USA 122, e 2501433122 (2025).10.1073/pnas.2501433122 PMC 1266392741248286 · doi ↗ · pubmed ↗
- 2Sun, Y. et al. Rubisco packaging and stoichiometric composition of the native β-carboxysome in Synechococcus elongatus PCC 7942. Plant Physiol. 197, kiae 665 (2024).10.1093/plphys/kiae 665PMC 1197343039680612 · doi ↗ · pubmed ↗
- 3Baker, R. T. et al. Ubiquitin and Protein Degradation, Part A. in Methods in Enzymolology Vol. 398 (ed. Deshaies, R. J.) 540–554 (Academic Press, 2005).
- 4Gasteiger, E. et al. in The Proteomics Protocols Handbook (ed. Walker, J. M.) 571–607 (Humana Press, 2005).
