Branched DNA processing by a thermostable CAS-Cas4 from Thermococcus onnurineus: Expanding biochemical landscape of nuclease activity
Muskan Jain, Asish Kumar Pattnayak, Sakshi Aggarwal, Praveen Rai, J. Kavya, Sanjeev Chandrayan, Manisha Goel, Vineet Gaur

TL;DR
This paper explores a CAS-Cas4 protein from an archaeal species, revealing its unique DNA processing abilities.
Contribution
The study provides the first biochemical characterization of a CAS-Cas4 protein from archaea.
Findings
TON_0321 shows 5′ to 3′ exonuclease activity.
It has structure-dependent endonuclease activity that cleaves near DNA branch points.
The protein's catalytic site is uniquely arranged for branch recognition.
Abstract
The adaptive immune function of CRISPR–CRISPR-associated protein (Cas) systems in bacteria and archaea is mediated through Cas. The adaptation module, typically involving Cas1, Cas2, and Cas4, helps integrate viral “spacer” sequences into the host genome. Cas4 proteins are classified into two types based on neighboring genes: CAS-Cas4, flanked by other cas genes, and Solo-Cas4, which exists independently. While CAS-Cas4 proteins are implicated in adaptation, they remain biochemically uncharacterized in archaea, unlike archaeal Solo-Cas4 proteins. This study biochemically characterizes TON_0321, a CAS-Cas4 protein from the type IV-C CRISPR cassette of Thermococcus onnurineus. TON_0321 exhibits 5′ to 3′ exonuclease activity and unique structure-dependent endonuclease activity, shedding light on CAS-Cas4 functional diversity. A distinct spatial organization of the catalytic site, angled…
Taxonomy
Topics: CRISPR and Genetic Engineering · RNA Interference and Gene Delivery · RNA and protein synthesis mechanisms
CRISPR systems are a type of acquired immune system in bacteria and archaea (1, 2, 3) and are already proving to be powerful tools in targeted genome editing (1, 2, 3, 4, 5, 6, 7, 8, 9). Their function is mediated by a variety of proteins, collectively termed CRISPR-associated proteins (Cas proteins). Currently, CRISPR function is understood to be mediated by either a group of proteins (class 1) or a single protein with multiple functional domains working as an effector protein (class 2). These two classes are further subdivided into several types and subtypes. The classification is based on the presence of “unique signature” proteins for each CRISPR-Cas type. Class 1 systems encompass type I, III, and IV systems, whereas class 2 CRISPR systems are subdivided into three types: II, V, and VI, involving Cas9, Cas12, and Cas13 proteins, respectively. Despite the diverse set of proteins serving as effectors among different CRISPR types, all CRISPR-Cas systems appear to operate in three stages: adaptation, where foreign DNA is incorporated into CRISPR arrays; expression, involving transcription and processing of CRISPR RNA from integrated spacers; and interference, where CRISPR RNA is used to guide the degradation of the invading genomic material, conferring immunity against reinvading pathogens (10, 11, 12, 13, 14). The adaptation modules of CRISPR systems typically contain Cas1 and Cas2, which excise, process, and integrate foreign DNA (prespacers) into the CRISPR-Cas loci as a new spacer (15, 16). However, several CRISPR types (I, II, and V) have an additional protein, termed Cas4, involved in the adaptation step (15) (Fig. 1A). Processing prespacers into spacer fragments of the correct size with appropriate ends is essential for integration in a specific orientation (17, 18, 19). Incorrect spacer processing compromises the defense response (20, 21).
Cas4 proteins have been shown to determine the length and orientation of the new spacers by recognizing specific short sequences called protospacer adjacent motifs (PAMs) (20, 21).
Figure 1. **Diversity among Cas4 proteins.** A, block diagram showing the typical organization of the adaptation module in different CRISPR subtypes. The adaptation module usually consists of Cas1, Cas2, and Cas4 proteins. It may contain other proteins in specific subtypes. A few of the CRISPR subtypes do not include Cas4. The typical type IV-C CRISPR subtype does not contain an adaptation module. The atypical type IV-C CRISPR cassette from Thermococcus onnurineus NA1 shows the presence of Cas2 and Cas4 adaptation proteins. Type I-A, I-B, III-B: different CRISPR subtypes; type IV-C: typical type IV-C cassette; type IV-C∗: atypical case of T. onnurineus NA1. B, a cartoon depicting conserved amino acid residues among various characterized Cas4 proteins: TON_0321 (T. onnurineus NA1), Pcal_0546 (Pyrobaculum calidifontis), SSO0001 (Sulfolobus solfataricus), SSO1391 (S. solfataricus), and GsCas4 (Geobacter sulfurreducens). Four cysteines involved in chelating the iron–sulfur cluster are shown in red. RecB active site residues are shown in pink and orange; the blue-shaded region depicts the RecB-like domain of Cas4 proteins. C, a schematic showing the difference in gene cassettes of SSO0001 from S. solfataricus, Pcal_0546 from P. calidifontis, GsCas4 from G. sulfurreducens, and TON_0321 from T. onnurineus NA1. D, comparison of the catalytic sites of λ exonuclease (cyan; Protein Data Bank code: 1AVQ) and TON_0321 (red, model) depicting the conservation of active site residues. The metal ion at the active site is shown as a sphere. RecB, recombination protein B.
Cas4 proteins belong to the proline–aspartate–(aspartate/glutamate)–X–lysine (PD-(D/E)XK) endonuclease-like domain superfamily (22). PD-(D/E)XK nucleases form a highly diverse superfamily of enzymes comprising restriction endonucleases (e.g., EcoRI, XhoI) (23), resolvases (e.g., phage lambda exonuclease, very short patch repair endonucleases) (24), transposases (e.g., Tn7 transposase) (25), DNA binding (e.g., RPB5) (26), repair (e.g., MutH) (27), and recombination (e.g., SOX) proteins (28). Based on their genomic location and the flanking genes, Cas4 proteins are of three types: CAS-Cas4 (CRISPR-associated Cas4 [CAS-Cas4] genes that occur as part of an array of Cas genes), Solo-Cas4 (Cas4 genes located outside the CRISPR–Cas locus), and MGE-Cas4 (Cas4 associated with mobile genetic elements) (29). CAS-Cas4s, being associated with CRISPR genes, participate in CRISPR-related adaptive immunity (29). The Solo-Cas4s are expected to carry out non-CRISPR functions, such as DNA repair and recombination (29), although some studies have shown their involvement in CRISPR function too (21). Likewise, MGE-Cas4 is predicted to be involved in the transposition of mobile genetic elements (29).
Three archaeal Cas4 proteins (SSO0001 and SSO1391 from Sulfolobus solfataricus and Pcal_0546 from Pyrobaculum calidifontis) have been biochemically characterized. Interestingly, these characterized enzymes showed varied oligomeric states: monomer (Pcal_0546), dimer (SSO1391), and decamer (SSO0001) (30, 31, 32). Mutational studies revealed that the four conserved cysteine residues are important for iron–sulfur (Fe–S) cluster binding (30, 31). Cas4 are sequence-dependent nucleases, recognizing PAM sequences (17, 23, 25, 33). Although these three characterized proteins were hypothesized to produce ssDNA overhangs, thereby facilitating the insertion of spacers into the CRISPR array, none of the three characterized proteins, SSO0001, SSO1391, and Pcal_0546, have shown a preference for any PAM sequence in the in vitro experiments (30, 31, 32).
In various CRISPR systems, Cas4 proteins have been reported to coordinate differently in spacer processing. In the type I-A system from Pyrococcus furiosus, CAS-Cas4 processes the PAM end, whereas Solo-Cas4 handles the non-PAM end (21). Similarly, in the type I-A system from Sulfolobus islandicus, Cas4 and CsaI proteins process both ends (34). In the type I-C system from Arthrobacter halodurans, Cas4 processes the PAM end, with the non-PAM end being processed by a cellular exonuclease or Cas1 (35). In general, Cas4, in conjunction with Cas1 and Cas2, is crucial for spacer generation and integration across type I-A, I-B, and I-C subtypes. In the type I-G and type V-B systems, a Cas1–4 fusion protein exists, with the Cas4 domain recognizing PAM (36, 37). In summary, the Cas4 protein appears to show PAM specificity only with Cas1 and Cas2 (35, 37, 38).
While several Cas4 proteins have been studied in vitro and in vivo, their function remains elusive because of significant functional variability. Notably, even among the CAS-Cas4s, their role varies according to the CRISPR subtype. Incidentally, all three in vitro characterized archaeal Cas4 proteins (SSO0001, SSO1391, and Pcal_0546) were later shown to belong to the Solo class of Cas4. Remarkably, no CAS-Cas4 protein of archaeal origin has been characterized in vitro until now. Recently, a bacterial Cas4–Cas1 fusion protein (GsCas4, accession no.: Q74H36.1) from Geobacter sulfurreducens (of type I-G) has been characterized (36). In this context, we present the novel structure-selective endonuclease activity of an archaeal CAS-Cas4 protein from Thermococcus onnurineus, TON_0321 (protein accession no.: TON_0321), associated with the type IV-C CRISPR cassette.
Results
Types of Cas4 systems
Cas4 proteins, members of the PD-(D/E)XK nuclease superfamily, are comprised of a recombination protein B (RecB) domain, three C-terminal cysteines, one N-terminal cysteine, and an Fe–S cluster. We performed a multiple sequence alignment of Cas4 proteins from various archaeal species using MUSCLE (MUltiple Sequence Comparison by Log-Expectation) (39) (Fig. S1A) and generated a phylogenetic tree using the MEGA (Molecular Evolutionary Genetics Analysis) software suite (Fig. S2) (40). A multiple sequence alignment of archaeal Cas4 proteins shows the conservation of amino acid residues characteristic of the PD-(D/E)XK superfamily, RecB motif, and QhXXY domain (Fig. S1A). The domain arrangement observed with archaeal Cas4 proteins is typical of the AddB family of exonucleases, known for their role in bacterial DNA recombination (41). The four cysteine residues chelating an Fe–S cluster and metal ion–chelating residues at the active site are also conserved across the Cas4 members (Figs. 1B, S1A).
Next, we examined each protein's genomic location to identify its flanking genes (Fig. S3). Phylogenetic analysis (40, 42) and genomic location categorized the Cas4 proteins into two distinct clades: the CAS-Cas4 and the Solo-Cas4 proteins (Fig. S2). The CAS-Cas4 proteins are part of a larger CRISPR system, surrounded by other Cas proteins. For example, in the CRISPR cassette of the protein TON_0321 (Fig. 1C), Cas4 is located next to several other Cas proteins, including Cas6, Cas10, Cas11, Cas7, Cas5, and Cas2. Because of its association with these other proteins, Cas4 in this case is called a CRISPR-associated Cas4 protein. On the other hand, Solo-Cas4 proteins are not part of the CRISPR cassette and are not flanked by other Cas proteins. For instance, proteins like SSO0001 and Pcal_0546 are surrounded by genes that are not related to Cas proteins (Fig. 1C). Based on their phylogenetic separation into distinct clades, we anticipate that CAS-Cas4 and Solo-Cas4 proteins exhibit functional divergence. The diversity of Cas4 proteins across different systems emphasizes the need to study each variant individually to fully comprehend their distinct roles, biochemical properties, and potential applications. Therefore, we decided to biochemically characterize the CAS-Cas4 protein to better understand how it might differ from the Solo types. This study is the first to investigate a CAS-Cas4 protein in depth, giving us a new basis for comparing it with the Solo-type Cas4 proteins.
TON_0321: Cas4 protein from T. onnurineus
The gene for Cas4 protein in T. onnurineus (TON_0321 protein) is present adjacent to a type IV-C CRISPR cassette, along with the presence of type III effectors (Cas2, Cas5, Cas7, Cas10, and Cas11) (Fig. 1C). Cas4 proteins are often located adjacent to Cas1 and Cas2 genes in most CRISPR systems, together forming the adaptation module (33, 43). Sometimes, Cas4 is fused with Cas1, as seen in subtype V-B and the type I-G system of Methanosarcina barkeri (15, 36, 37, 44, 45, 46, 47, 48, 49) (Fig. 1A). Cas4 and Cas2 (but not Cas1) are present next to the type IV-C CRISPR cassette of T. onnurineus (Fig. 1C). Since Cas1, Cas2, and Cas4 form a functional module in many CRISPR subtypes, and Cas1 is absent altogether in the T. onnurineus type IV-C CRISPR cassette, we examined the potential interaction between TON_0321 and the Cas2 protein from T. onnurineus. Interestingly, we did not observe a stable interaction between TON_0321 and Cas2 protein under in vitro conditions (Fig. S4). This intriguing outcome led us to explore whether TON_0321 in T. onnurineus is a putative protein or exhibits a previously unexplored biochemical profile.
We assessed the integrity of the catalytic site in TON_0321 by predicting its secondary structure, identifying disordered regions (Fig. S5), and modeling its tertiary structure using AlphaFold2 (50, 51). The predicted structure of TON_0321 was then compared with those of related proteins, including Pcal_0546 from P. calidifontis (Protein Data Bank [PDB] code: 4R5Q), SSO0001 from S. solfataricus (PDB code: 4IC1), and GsCas4 from G. sulfurreducens (PDB code: 7MI4) (Fig. S6). Notably, Cas4 proteins, whether solo or CRISPR associated, exhibit a similar fold, sharing a conserved core motif with slight variations in the loop regions (Fig. S7). Furthermore, a structural comparison of the catalytic center of TON_0321 with λ exonuclease (PDB code: 1AVQ), a well-studied representative member of the PD-(D/E)XK nuclease superfamily, established that the catalytic center of TON_0321 exhibits the signature geometry characteristic of PD-(D/E)XK nucleases (Fig. 1D). In the PD-(D/E)XK nuclease superfamily, to which λ exonuclease belongs, two metal ions (Mg^2+^) in the active site are crucial for catalysis: one activates a water molecule to attack the scissile phosphate, whereas the other stabilizes the transition state (52, 53). The divalent metal ions are chelated by conserved histidine, aspartate, and glutamate residues of the RecB motif. In TON_0321, D96 and E105 are identified as active site residues through sequence alignment (Fig. S1A) and conservation of the active site (Fig. 1D). The two residues were mutated to alanine (TON_0321^D96A/E105A^), and the resulting catalytically inactive enzyme was used as a negative control in subsequent experiments (Fig. S1B). In summary, although TON_0321 does not form a stable complex with Cas2 under the in vitro conditions explored in the current study, it retains the signature residues associated with Cas4 proteins and adopts a fold consistent with other Cas4 family members.
The absence of a detectable interaction between TON_0321 and Cas2 is unlikely to result from common artifacts, such as tag interference (Fig. S4E). However, we cannot exclude the possibility that a Cas2–Cas4 interaction may occur in the presence of additional factors under in vivo conditions. Taken together, these findings suggest that TON_0321 from T. onnurineus may be a catalytically active nuclease, prompting further investigation into its nuclease activities.
Optimum catalytic conditions for TON_0321
The TON_0321 protein was purified using affinity chromatography followed by size-exclusion chromatography. To comprehensively characterize the nuclease potential of TON_0321, we optimized a wide range of conditions, encompassing protein concentrations, temperature, pH, salt concentrations, and metal ions using ssDNA labeled with cyanine-5 fluorophore (Cy5) at the 3′-end. TON_0321 proved to be thermostable and catalytically active over a wide range of temperatures, pH, and NaCl concentrations. The high thermostability of TON_0321 reflects the thermophilic characteristic of T. onnurineus. Cas4 enzymes typically follow a divalent metal ion–based nucleophilic attack mechanism to break a phosphodiester bond (52, 54, 55). Nuclease assays were then conducted at either 35 °C, to preserve the secondary structure of DNA substrates, or 55 °C, to ensure a fully extended state of ssDNA. A pH of 8.0 and an NaCl concentration of 125 mM were selected for subsequent experiments based on the stability of the protein and of the annealed synthetic DNA substrates. In addition, a concentration of 2.5 mM MgCl_2_ was used for the nuclease assays.
TON_0321 exonuclease activity
We explored the exonuclease activity of TON_0321 using two ssDNA substrates labeled with fluorophores: one labeled with 6-carboxyfluorescein (6-FAM) at the 5′ end (Y0-1) and the other with Cy5 at the 3′ end (Y0-4). Experiments were conducted at 35 °C and 55 °C. For the 5′ 6-FAM-labeled DNA, Y0-1, we observed a major product of five to six bases within 2.5 min, increasing over time, indicating 5′ to 3′ exonuclease activity. Similarly, for the 3′ Cy5-labeled substrate (Y0-4), we observed a series of products that were gradually shortened to a fragment of around four bases over time, further supporting 5′ to 3′ exonuclease activity (Figs. 2A, S8). Using the Y0-1 substrate, intermediate-sized products were observed, which may be attributed to endonuclease activity, as explored in subsequent sections.
Figure 2. **Exonuclease activity of TON_0321 on ssDNA.** A, the catalytic activity of TON_0321 on ssDNA labeled at 5′-end with 6-FAM (upper panel) (Y0-1) and ssDNA labeled with Cy5 at 3′-end (lower panel) (Y0-2). The reaction products were resolved on an 18% TBE–urea PAGE. Upper panel gels were scanned for 6-FAM signal, and the lower panel gels were scanned for Cy5 signal. The activity of TON_0321 was also tested on ssDNA labeled at 5′ end with 6-FAM, with biotin at 3′ end (Y0-1_biotin), and on ssDNA labeled with Cy5 at 3′ end, with biotin at 5′ end (Y0-4_biotin). The biotinylated oligos were incubated with 2.5 times molar excess of streptavidin to completely block the biotinylated end of the oligo. The biotin–streptavidin-conjugated oligos were used as substrates for activity assay. The reaction products were resolved on an 18% TBE–urea PAGE. Gels were scanned for 6-FAM and Cy5 signals. M represents a marker made from mixing synthetic oligonucleotides of different sizes (40, 20, 10, and 6 nucleotides), and L represents a ladder made from DNase digestion of the 40-mer substrate. All experiments were done in triplicate (Figs. S8, S9). For experiments with ssDNA labeled with Cy5 at 3′ end (lower panel), samples for 35 °C and 55 °C were run on the same gel with a common ladder and marker. B, a schematic illustrating the processing of DNA substrate by processive and distributive enzymes. Processive enzymes perform multiple cleavage cycles on a single bound substrate before releasing it, which prevents them from binding a new, unlabeled substrate introduced midreaction. Distributive enzymes catalyze a single cleavage per substrate binding, detaching afterward and becoming available to interact with subsequently added substrate molecules. C, TON_0321 acts in a distributive manner. An activity assay reaction was set with 3′ Cy5-labeled ssDNA, and formation of products was observed over different time points. In a parallel reaction, 20-fold excess unlabeled DNA of the same sequence was added after 2 min of time point. A decrease in the cleavage of the labeled product in comparison to the control reaction indicated that TON_0321 can dissociate from its substrate and catalyze the reaction in a distributive manner. The reaction products were resolved on an 18% TBE–urea PAGE. The gels were scanned for Cy5 signal. M represents a marker made from mixing synthetic oligonucleotides of different sizes (40, 20, 10, and 6 nucleotides), and L represents a ladder made from DNase digestion of the 40-mer substrate. All experiments were done in triplicate (Fig. S11). Cy5, cyanine-5 fluorophore; 6-FAM, 6-carboxyfluorescein; TBE, Tris–borate–EDTA.
To rule out any possibility of 3′ to 5′ exonuclease activity, we used ssDNA substrates blocked with a biotin–streptavidin conjugate at one end and a fluorophore at the other end. The 5′ 6-FAM-labeled substrate (Y0-1_biotin) was blocked with biotin at the 3′ end, whereas the 3′ Cy5-labeled substrate (Y0-4_biotin) was blocked with biotin at the 5′ end (Figs. 2A, S9). These substrates were incubated with streptavidin to form biotin–streptavidin conjugates, followed by an activity assay with TON_0321 at 35 °C. The reactions were analyzed using 18% Tris–borate–EDTA (TBE)–urea PAGE. For Y0-1_biotin, a small (∼6 nt) product accumulated early, and there was no product formation with Y0-4_biotin, confirming 5′ to 3′ exonuclease activity (Figs. 2A, S9). A double mutant of metal-chelating residues (i.e., D96 and E105), TON_0321^D96A/E105A^, was used as a negative control (Fig. S10).
Next, we aimed to determine how TON_0321 interacts with DNA molecules, specifically whether it acts processively (sliding along the DNA and catalyzing multiple consecutive reactions without releasing the substrate) or distributively (performing only a single reaction with the substrate before releasing it) (56). An exonuclease assay was conducted with ssDNA Y0-4 labeled with Cy5 at the 3′-end at 35 °C. After 1 min, a 20-fold excess of unlabeled ssDNA was added to the reaction, which was followed for 25 min (Fig. 2B). As a control, a similar reaction was performed without adding the excess unlabeled DNA. In this experiment, TON_0321 would have bound the labeled substrate and started catalysis before the unlabeled substrate was added to the reaction. If the enzyme is processive, the addition of unlabeled substrate would not affect the reaction, and the enzyme would continue to act on the labeled substrate it has already bound. Such an enzyme would slide over the bound substrate, completing its catalysis before binding to another substrate molecule. A distributive enzyme, on the other hand, would dissociate from the bound labeled substrate molecule after catalyzing one cleavage event and would have to bind again for the next round of catalysis. In that case, after the addition of unlabeled substrate, the enzyme would mostly bind the unlabeled substrate because it is present in excess, and this would decrease cleavage of the labeled substrate compared with the control reaction. Indeed, the addition of unlabeled DNA reduced the cleavage of labeled DNA compared with the control reaction (Figs. 2C and S11), indicating that TON_0321 exhibits a distributive mode of action, cleaving once before dissociating from the substrate. In summary, TON_0321 functions as a 5′ to 3′ exonuclease, operating distributively on DNA substrates.
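The reasoning behind this competitor-challenge experiment can be illustrated with a toy model. The sketch below is not a kinetic model of TON_0321 and uses no measured data. It assumes, purely for illustration, one cleavage event per time step, that a processive enzyme never leaves its labeled substrate, and that a distributive enzyme rebinds labeled and unlabeled DNA in proportion to their abundance. The function name and its parameters are hypothetical.

```python
def expected_labeled_cleavages(total_steps, chase_step, fold_excess, processive):
    """Expected cleavage events on labeled DNA per enzyme molecule (toy model).

    total_steps : number of time steps followed (one cleavage event per step)
    chase_step  : step at which unlabeled competitor is added (None = no chase)
    fold_excess : molar excess of unlabeled over labeled DNA after the chase
    processive  : True if the enzyme stays on its first (labeled) substrate
    """
    events = 0.0
    for step in range(total_steps):
        chased = chase_step is not None and step >= chase_step
        if processive or not chased:
            # The enzyme acts on labeled DNA with certainty.
            events += 1.0
        else:
            # A distributive enzyme rebinds at random after each cut;
            # labeled DNA is 1 part in (1 + fold_excess).
            events += 1.0 / (1.0 + fold_excess)
    return events


# Illustrative numbers only: 25 steps, chase after step 1, 20-fold excess.
control = expected_labeled_cleavages(25, None, 20, processive=False)
distributive = expected_labeled_cleavages(25, 1, 20, processive=False)
processive = expected_labeled_cleavages(25, 1, 20, processive=True)
print(control, round(distributive, 2), processive)
```

In this toy model only the distributive case falls below the no-chase control, which is the qualitative signature the assay looks for.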
Endonuclease activity of TON_0321 on dsDNA plasmid
We tested the activity of TON_0321 on linear blunt-end dsDNA at 35 °C and observed no cleavage, suggesting its inability to cleave dsDNA. However, at 55 °C, the enzyme showed some catalytic activity, as evident from the accumulation of two fragments sized around 5 bases (major band) and 34 bases (minor band) (Figs. 3A, S12, and S13). At 55 °C, the blunt ends of dsDNA could have melted and opened up, creating branches with ssDNA. This branching may have given rise to the observed bands through either exonuclease or endonuclease activity on the branched DNA substrate.
Figure 3. **Endonuclease activity of TON_0321.** A, catalytic activity of TON_0321 protein on dsDNA with one strand labeled with Cy5 at the 3′-end at two different temperatures, 35 °C and 55 °C. The reaction products were resolved on an 18% TBE–urea PAGE. The gels were scanned for Cy5 signal. M represents a marker made from mixing synthetic oligonucleotides of different sizes (40, 10, and 6 nucleotides), and L represents a ladder made from DNase digestion of the 40-mer substrate. All experiments were done in triplicate (Fig. S12). The major cleavage site in the schematic of the DNA substrate is marked by a solid arrow. B, a schematic diagram showing the cruciform assay using the double-stranded plasmid pIRbke8^mut^ containing an inverted repeat sequence that forms a cruciform-type extrusion. The cruciform-like structure on the plasmid contains two sites for EcoRI restriction endonuclease (marked by asterisk). The purified plasmid is in a supercoiled (S.C.) state, which on a single nick forms nicked circular (N.C.) DNA, and two simultaneous nicks produce a linear (L) form of DNA. C, ethidium bromide–stained 0.8% agarose gel showing results of the cruciform assay carried out with TON_0321 protein. The lane with EcoRI represents the positive control, and the lane with substrate alone represents the negative control. All experiments were done in triplicate (Fig. S14). The graph in the lower panel shows the quantitation of different forms of DNA formed in the cruciform assay along with standard errors calculated from three independent experiments. The reaction was carried out for 40 min. Aliquots were taken out at different time points (0, 2.5, 5, 10, 20, and 40 min). D, size-exclusion chromatogram of TON_0321 (black) purified on a HiLoad 16/600 Superdex 200 pg gel filtration column along with the gel filtration markers (broken blue) (vitamin B12, myoglobin, ovalbumin, gamma globulin, and thyroglobulin). The peak corresponding to the TON_0321 protein is marked by an arrow. The fractions from this peak were run on a 12% SDS gel. The molecular weight of TON_0321 protein was calculated using a standard curve of markers (Fig. S17C). Cy5, cyanine-5 fluorophore; TBE, Tris–borate–EDTA.
To check secondary structure–guided endonuclease activity and to rule out any contribution from the exonuclease activity of TON_0321, we used a DNA substrate without free ends but featuring secondary structure elements. For this, a cruciform assay was performed with a cruciform-containing plasmid, pIRbke8^mut^ (substrate 1) (57, 58). The cruciform substrate (pIRbke8^mut^) is a supercoiled plasmid with secondary structure elements, containing an inverted repeat sequence that extrudes into a cruciform structure at 37 °C. A single nick by the nuclease dissolves this structure, producing a nicked circular plasmid, whereas two nicks result in linear DNA. These forms (supercoiled, nicked circular, and linear) are easily distinguished by agarose gel electrophoresis (Fig. 3B). EcoRI, which linearizes the plasmid; T7 endonuclease I, which generates both linear and nicked circular plasmids (59); and Nt.BspQI (60), a nickase that produces a nicked circular plasmid, were used as positive controls (Figs. 3 and S14–S16). Furthermore, a dsDNA plasmid derived from pIRbke8^mut^ but lacking the cruciform structure (substrate 2) was used as a negative control (Fig. S15).
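The quantitation of plasmid forms reported for this assay (fractions of each form with standard errors from three independent experiments) can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: the band intensities are invented, the function names are hypothetical, and any correction for differential dye binding between plasmid forms is ignored.

```python
from math import sqrt
from statistics import mean, stdev


def form_fractions(band_intensities):
    """Convert raw band intensities for one lane into fractions that sum to 1."""
    total = sum(band_intensities.values())
    return {form: value / total for form, value in band_intensities.items()}


def mean_and_sem(replicate_lanes):
    """Mean fraction and standard error of the mean for each DNA form."""
    fractions = [form_fractions(lane) for lane in replicate_lanes]
    summary = {}
    for form in fractions[0]:
        values = [f[form] for f in fractions]
        # SEM = sample standard deviation / sqrt(number of replicates)
        summary[form] = (mean(values), stdev(values) / sqrt(len(values)))
    return summary


# Hypothetical intensities (arbitrary units) for one time point, three replicates.
replicates = [
    {"supercoiled": 80, "nicked_circular": 15, "linear": 5},
    {"supercoiled": 70, "nicked_circular": 20, "linear": 10},
    {"supercoiled": 90, "nicked_circular": 10, "linear": 0},
]
summary = mean_and_sem(replicates)
for form, (avg, sem) in summary.items():
    print(f"{form}: {avg:.3f} ± {sem:.3f}")
```

Repeating the calculation for each time point would give the kind of time course plotted for the three DNA forms.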
In the cruciform assay, the activity of TON_0321 on the cruciform-containing plasmid (pIRbke8^mut^, substrate 1) resulted in the formation of nicked circular DNA, demonstrating its endonuclease activity on dsDNA. However, compared with substrate 1, negligible product was observed with the dsDNA plasmid derived from pIRbke8^mut^ but lacking the cruciform structure (substrate 2) (Fig. S15). This indicated that TON_0321 requires a DNA branching point to exhibit its endonuclease activity. Since TON_0321 generated nicked circular DNA, a product arising from a single nick, we anticipated that it is a monomer in solution. In contrast, linearization of the plasmid requires two simultaneous and coordinated nicks. Therefore, we determined the oligomeric state of TON_0321 by size-exclusion chromatography and found it to exist as a monomer in solution (Figs. 3D and S17) similar to SSO1391 (31), but unlike Pcal_0546, which is a dimer (31), and SSO0001, which forms a toroidal structure of five dimers (30). This oligomeric state remains unchanged under different protein concentrations (Fig. S17A).
Therefore, TON_0321 can identify branched DNA molecules, exhibit endonuclease activity, and operate as a monomer. In order to further validate the endonuclease activity of TON_0321, we carried out endonuclease assays on various synthetic branched DNA substrates.
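Estimating molecular weight from a size-exclusion standard curve, as done for the oligomeric state above, is commonly a linear fit of log10(molecular weight) against elution volume. The sketch below illustrates that calculation under stated assumptions: the marker masses are the nominal values commonly quoted for these five proteins, every elution volume is invented for illustration, and the function names are hypothetical. It is not the authors' calibration.

```python
from math import log10


def fit_standard_curve(standards):
    """Least-squares line log10(MW) = slope * volume + intercept.

    standards: list of (elution_volume_mL, molecular_weight_Da) pairs.
    """
    xs = [volume for volume, _ in standards]
    ys = [log10(mw) for _, mw in standards]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def estimate_mw(volume, slope, intercept):
    """Molecular weight (Da) predicted for a peak at the given elution volume."""
    return 10 ** (slope * volume + intercept)


# Nominal marker masses (assumed) paired with hypothetical elution volumes.
markers = [
    (48.0, 670_000),  # thyroglobulin
    (65.0, 158_000),  # gamma globulin
    (80.0, 44_000),   # ovalbumin
    (90.0, 17_000),   # myoglobin
    (110.0, 1_350),   # vitamin B12
]
slope, intercept = fit_standard_curve(markers)
# 88.0 mL is a made-up peak position, not a measured value.
print(f"estimated MW: {estimate_mw(88.0, slope, intercept) / 1000:.1f} kDa")
```

Comparing such an estimate with the mass calculated from the protein sequence indicates whether a peak corresponds to a monomer or a higher-order assembly.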
Endonuclease activity of TON_0321 on branched DNA substrates
TON_0321 stands out for its endonuclease activity, a trait not thoroughly explored in other Cas4 proteins. While many PD-(D/E)XK superfamily members exhibit endonuclease capabilities, the Cas4 proteins previously studied (SSO0001, SSO1391, and Pcal_0546) demonstrated endonuclease activity solely on circular ssDNA plasmids and were unable to cleave circular dsDNA plasmids (30, 31, 32). Prompted by the endonuclease activity of TON_0321 on cruciform plasmids and dsDNA at higher temperatures, we expanded our investigation to include its activity on various branched DNA structures, such as 5′ flap, 3′ flap, and splayed arm at 35 °C (Figs. 4, S18, Table S2). TON_0321 efficiently processed all these branched DNA substrates. In the case of the 5′ flap, TON_0321 cleaved the Cy5-labeled strand two nucleotides from the branching point in the 5′ direction, with no nick in the 6-FAM-labeled strand (Figs. 4, A and B and S19). With the 3′ flap substrate, TON_0321 cleaved only the Cy5-labeled strand two nucleotides away from the junction in the 5′ direction (Figs. 4, A and B and S19). In the case of the splayed arm substrate, TON_0321 generated a major nick two nucleotides away from the branching point in the Cy5-labeled strand, whereas a minor nick was also observed on the 6-FAM-labeled strand at later time points (Figs. 4, A and B and S19). Notably, TON_0321 did not make any major nicks on the 6-FAM-labeled strand across all substrates studied (Figs. 4, A and B, and S19). Catalytically inactive TON_0321^D96A/E105A^ was used as a negative control (Fig. S20).
Figure 4. **Endonuclease activity of TON_0321 on branched DNA substrates.** A, the catalytic activity of TON_0321 protein on branched DNA molecules: 5′ flap, 3′ flap, and splayed arm. Each substrate has two labels: one strand labeled at 5′-end with 6-FAM and another strand labeled at 3′-end with Cy5. The reaction products were resolved on an 18% TBE–urea PAGE. The same gel was scanned for 6-FAM signal (upper panel) and for the Cy5 signal (lower panel). M represents a marker made from mixing synthetic oligonucleotides of different sizes (40, 10, and 6 nucleotides), and L represents a ladder made from DNase digestion of the 40-mer substrate. All experiments were done in triplicate (Fig. S19). B, the schematic of various branched DNA substrates shows the major cleavage sites as solid arrows and the minor sites as broken arrows. C, binding study of TON_0321 with different DNA substrates: ssDNA (light blue), dsDNA (orange), 5′ flap (gray), 3′ flap (yellow), and splayed arm (dark blue) using fluorescence anisotropy. DEAA_5′ flap represents binding analysis of catalytically dead TON_0321^D96A/E105A^ in the presence of a 5′ flap substrate. 6-FAM-labeled substrates were used for anisotropy experiments. The Y-axis shows a change in anisotropy (A–A_0_), where A is observed anisotropy and A_0_ is anisotropy of DNA substrate alone. Cy5, cyanine-5 fluorophore; 6-FAM, 6-carboxyfluorescein; TBE, Tris–borate–EDTA.
We used fluorescence anisotropy to assess the binding of TON_0321 with various DNA substrates, both branched and nonbranched (Fig. S21A, Table S3). The substrates tested included ssDNA, blunt-end dsDNA, 5′ flap, 3′ flap, and a splayed arm, each labeled with a 5′ 6-FAM label on one of the arms. TON_0321 showed binding with all these substrates (Fig. 4C, Table S4). Despite showing no catalytic activity on dsDNA, TON_0321 bound it comparably to the other DNA substrates (Fig. 4C). We also examined the binding of TON_0321^D96A/E105A^ with the 5′ flap substrate. Notably, this active-site mutant, lacking key metal-chelating residues, exhibited reduced binding compared with the wild-type protein (Figs. 4C and S21B). In summary, TON_0321 is active with all tested branched substrates (5′ flap, 3′ flap, and splayed arm), cleaving only the Cy5-labeled strands. TON_0321 can bind to all DNA substrates, including dsDNA.
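Anisotropy titrations of this kind are typically reduced to a change in anisotropy (A − A_0) and then compared against a binding model. The sketch below shows one common way to do that, a one-site hyperbolic isotherm fitted by a simple grid search. The source does not state which model was used, so the model choice, the function names, and all numbers here are assumptions for illustration; the model also assumes that the protein is in large excess over the labeled DNA.

```python
def delta_anisotropy(observed, substrate_alone):
    """Change in anisotropy (A - A0) for each titration point."""
    return [a - substrate_alone for a in observed]


def one_site(protein_conc, kd, delta_max):
    """One-site binding isotherm: dA = dA_max * [P] / (Kd + [P])."""
    return delta_max * protein_conc / (kd + protein_conc)


def fit_one_site(protein_concs, deltas, kd_candidates):
    """Grid search over Kd; dA_max is solved in closed form for each candidate."""
    best = None
    for kd in kd_candidates:
        shape = [p / (kd + p) for p in protein_concs]
        delta_max = sum(d * s for d, s in zip(deltas, shape)) / sum(
            s * s for s in shape
        )
        sse = sum((d - delta_max * s) ** 2 for d, s in zip(deltas, shape))
        if best is None or sse < best[0]:
            best = (sse, kd, delta_max)
    return best[1], best[2]


# Synthetic, noise-free titration generated from Kd = 2.0 and dA_max = 0.10
# (arbitrary concentration units); these are not values from the study.
concs = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
a0 = 0.05
observed = [a0 + one_site(p, 2.0, 0.10) for p in concs]
deltas = delta_anisotropy(observed, a0)
kd_fit, dmax_fit = fit_one_site(concs, deltas, [0.5, 1.0, 2.0, 4.0, 8.0])
print(kd_fit, round(dmax_fit, 3))
```

With real data, a proper nonlinear least-squares routine and a finer range of candidate values would replace the coarse grid used here.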
Sequence versus structure dependency
The consistent cleavage of the Cy5-labeled strand in all substrates by TON_0321 suggests either that this activity is sequence dependent or that the enzyme recognizes branch points in an orientation-specific manner. Cas4 proteins are known for identifying PAM sequences for spacer acquisition in the adaptation step (20, 21, 35, 37). The type I-E CRISPR system from Escherichia coli lacks a Cas4 protein, and its function is complemented by the C-terminal loop of the Cas1 protein, which performs PAM recognition. On the other hand, in other CRISPR types where Cas4 is present, the Cas1 protein lacks this C-terminal loop (61). Since the type IV-C CRISPR cassette under study lacks Cas1, we investigated whether the Cas4 protein, TON_0321, has a sequence dependency. Given that TON_0321 consistently nicked only the Cy5-labeled strand, we tested this by reversing the sequence near the branch point in new 5′ flap (5′ flap-N) and 3′ flap (3′ flap-N) substrates (Figs. 5 and S22). Interestingly, TON_0321 continued to nick at the same position, two nucleotides away from the junction point in the 5′ direction, on the Cy5-labeled strand, regardless of the sequence, thus ruling out sequence dependency in the presented experimental set-up.

Figure 5. **TON_0321 is a secondary structure–selective endonuclease.** The catalytic activity of TON_0321 protein on branched DNA molecules, 5′ flap-N and 3′ flap-N, with each having one strand labeled at the 5′-end with 6-FAM and another strand labeled at the 3′-end with Cy5. The reaction products were resolved on 18% TBE–urea PAGE. The gels were scanned for (A) 6-FAM signal and (B) Cy5 signal. M represents a marker made from mixing synthetic oligonucleotides of different sizes (40, 20, 10, and 6 nucleotides), and L represents a ladder made from DNase digestion of the 40-mer substrate. All experiments were done in triplicate (Fig. S22). The major cleavage site in the schematic of branched DNA substrates is marked by a solid arrow. DNA binding model of TON_0321 with (C) ssDNA, (D) dsDNA, and (E) splayed arm substrate. The bound DNA was modeled by superimposing the catalytic domain of TON_0321 with the bound structure of GsCas4 (Protein Data Bank code: 7MI4). Cy5, cyanine-5 fluorophore; 6-FAM, 6-carboxyfluorescein; TBE, Tris–borate–EDTA.
Next, we checked whether the unique nicking of one specific arm in branched DNA substrates (i.e., the Cy5-labeled arm) by TON_0321 relates to orientation-specific recognition at the branch point. Using AlphaFold2, we modeled the structure of TON_0321 with high-confidence prediction scores (predicted local distance difference test value of 87.44) (50, 51). The model showed a positively charged patch positioned at an angle to the catalytic site and capable of binding dsDNA. However, for DNA to reach the catalytic site for cleavage, it must bend, a flexibility more easily achieved at the branch points of DNA (Figs. 5, C–E and S23). Many structure-selective endonucleases are known to utilize DNA bending and positively charged surface patches to carry out catalysis near the branching point (62, 63, 64). We further modeled ssDNA in the catalytic site of TON_0321 using GsCas4 (PDB code: 7MI4) and connected it to the dsDNA bound at the positively charged patch, creating a splayed arm structure (Fig. 5E). This model suggests that DNA bending is crucial for branched DNA recognition and the selective cleavage of the Cy5-labeled arm (Fig. 5E). Cleavage of the 6-FAM-labeled arm would require reverse DNA orientation (impacting the stereochemistry at the catalytic site) or reorientation of the dsDNA and branch point, as seen in PDB 7MI4. Unlike the 7MI4 structure, TON_0321 lacks stabilizing elements for dsDNA. Therefore, the observed catalytic behavior and our model predictions together indicate that TON_0321 selectively targets branched DNA molecules.
Discussion
CAS-Cas4 proteins have been established to participate in the adaptation step, possibly through recognition of PAM sequences, thereby impacting the correct integration of functional spacers. However, their mode of action and specific role appear to vary considerably between CRISPR systems and types. This is not surprising, given the high sequence variability among the Cas4 proteins. Phylogenetic studies have suggested that CAS-Cas4 proteins exhibit CRISPR type–specific congruence in their evolutionary path (26), implying that the Cas4 of a particular CRISPR type–subtype should exhibit unique and specific sequence and structural features. Characterization of CAS-Cas4 proteins from different CRISPR types would aid in correlating their sequence and structural features with their experimentally observed functional divergence. To date, three Cas4 proteins have been characterized biochemically from archaeal systems: SSO0001 and SSO1391 from S. solfataricus and Pcal_0546 from P. calidifontis. Unlike TON_0321, none of the three belongs to any CRISPR type; all belong to the solo class of Cas4 proteins (30, 31, 32) (Fig. 1). Only a Cas4–Cas1 fusion protein from the bacterium G. sulfurreducens, Q74H36.1, belonging to CRISPR type I-G, has been recently characterized (36). Although Cas4 proteins are typically absent in type IV CRISPR systems (65, 66), the presence of a Cas4 gene (encoding the TON_0321 protein) adjacent to a type IV-C CRISPR system in T. onnurineus marks an intriguing deviation from this pattern. Furthermore, the presence of Cas2 and the absence of Cas1 genes, together with the inability of TON_0321 to form a stable complex with Cas2 protein, underscore the diversity and complexity inherent in Cas4 proteins across various organisms and CRISPR systems and warrant biochemical characterization.
Our studies have established that, as a PD-(D/E)XK superfamily member, the TON_0321 protein demonstrates characteristic divalent metal ion–dependent nuclease activity. TON_0321 exhibits 5′ to 3′ exonuclease activity akin to SSO0001 and Pcal_0546, whereas SSO1391 possesses both 5′ to 3′ and 3′ to 5′ exonuclease activities. In the present study, TON_0321 is classified as an exonuclease based on its requirement for a free 5′ end and its stepwise cleavage of the substrate, even though it produces oligonucleotides rather than mononucleotides or dinucleotides. This feature, that is, the generation of oligonucleotides, is notably distinct from other archaeal Cas4 proteins characterized so far. In generating oligonucleotide products, TON_0321 is similar to other known exonucleases such as E. coli exonuclease V (RecBCD), exonuclease VII, and phage T5 exonuclease, all of which generate short oligonucleotide products while acting from DNA termini (67, 68, 69, 70). TON_0321 does not degrade circular DNA unless it contains a branched structure.
Exploring the endonuclease potential of TON_0321 offers further valuable insights into its unique characteristics. TON_0321 did not cleave linear dsDNA at a lower temperature (35 °C), but it could cleave dsDNA at a higher temperature (55 °C), possibly because of opening of the blunt ends. It also cut circular dsDNA with secondary structures, suggesting an ability to recognize and cut branched DNA substrates that is not seen in other archaeal Cas4 proteins (30, 31, 32) (summarized in Table 1). In addition, TON_0321 can cut several branched DNA molecules such as 5′ flap, 3′ flap, and splayed arm, marking the first instance of a Cas4 protein being identified as a structure-dependent endonuclease. Most structure-dependent endonucleases possess structural features to identify branching in DNA substrates, and many of them recognize bending in DNA. The predicted structure of TON_0321 shows a highly positively charged surface at a sharp angle to the catalytic site, allowing the possibility of recognizing sharp bends in branched DNA substrates. This is also reflected in the observation that TON_0321 can bind dsDNA without cleaving it yet cleaves several branched DNA substrates.

Table 1. A comparison of the catalytic characteristics of TON_0321 with the Cas4 proteins characterized from archaea

| Parameters | SSO0001 (30, 32) | SSO1391 (31, 32) | Pcal_0546 (31, 32) | TON_0321 (this study) |
|---|---|---|---|---|
| Exonuclease activity: 5′ to 3′ exonuclease | Yes | Yes | Yes | Yes |
| Exonuclease activity: 3′ to 5′ exonuclease | No | Yes | No | No |

| Parameters | Previously characterized archaeal Cas4 proteins | TON_0321 (this study) |
|---|---|---|
| Endonuclease activity: substrates | Circular ssDNA of M13mp18 phage | Circular dsDNA with cruciform; 5′ flap; 3′ flap; splayed arm |
| Activity on splayed arm substrate: observation | Transient accumulation of dsDNA product, slowly degraded to smaller size | Nick at specific position from junction |
| Activity on splayed arm substrate: inference | ATP-independent DNA unwinding | Nicks two nucleotides away from the junction |
None of the previously characterized Cas4 proteins have shown sequence-dependent cleavage in vitro (30, 31, 32). Similarly, TON_0321 showed sequence-nonspecific endonuclease activity in the present context. Cas4 proteins from type I-A, I-B, and I-C systems, without other adaptation proteins (Cas1 and Cas2), also show sequence-nonspecific nuclease activity in vivo (35, 38), suggesting that the interaction with Cas1 and Cas2 may be essential for PAM recognition–based activity (38). The type IV-C CRISPR cassette under study lacks the Cas1 protein of the adaptation module. However, we cannot rule out the possibility that TON_0321 forms a complex with the adaptation protein Cas2 in vivo, thereby activating sequence specificity, although under in vitro conditions we did not obtain a stable complex between TON_0321 and Cas2.
Another significant difference between TON_0321 and other characterized Cas4 proteins lies in their oligomeric state. SSO0001 forms a toroidal structure from five dimers (30), SSO1391 exists as a dimer in solution (31), and Pcal_0546 exists as a monomer (31). Interestingly, TON_0321 is also monomeric in solution at several different concentrations tested (Fig. S17). SSO0001, because of its toroidal structure, has been proposed to serve as a sliding clamp for other Cas proteins (30). Superposition with the toroidal structure of SSO0001 suggests that the N terminus of TON_0321 is oriented away from the protein–protein interaction interface, making steric hindrance from an N-terminal tag during oligomerization unlikely based on the current model. However, this interpretation remains subject to the limitations inherent in structural modeling (Fig. S17B). Also, SSO0001 might serve as a powerful DNA-degrading machinery if its multiple active sites can function in unison (30). Since Cas4 proteins typically function in conjunction with other adaptation proteins, such as Cas1 and Cas2, as observed for monomeric Pcal_0546, it is plausible that TON_0321 also forms a complex with other adaptation proteins. Future studies could investigate whether TON_0321 participates in CRISPR-related processes, such as generating ssDNA overhangs in spacers, or explore its potential roles in non-CRISPR functions, such as DNA repair or broader nucleic acid metabolism (31).
Studying the uniqueness of TON_0321 has enhanced our understanding of the functional divergence of Cas4 proteins and expanded the landscape of nuclease activity hitherto assigned to Cas4 proteins. TON_0321 is a versatile nuclease capable of processing ssDNA and branched DNA substrates, suggesting its potential involvement in diverse cellular pathways requiring precise nucleic acid processing. TON_0321 shows endonuclease activity toward characteristic intermediates of DNA repair pathways, chromosome segregation mechanisms, and CRISPR systems. Like TON_0321, flap endonuclease 1 also exhibits 5′ to 3′ exonuclease and structure-selective endonuclease activities (71, 72). The ability of TON_0321 to target branched DNA intermediates in a structure-selective manner offers opportunities to complement existing CRISPR–Cas technologies and may inspire the development of innovative genome editing tools. Furthermore, TON_0321 may play a significant role in plasmid propagation by enhancing recombination with other nucleic acids or maintaining plasmid mobility, warranting further investigation (66).
The unique properties of TON_0321 may also offer valuable insights for developing innovative genome editing tools, expanding the toolkit for genetic engineering and molecular biology applications. Future efforts to decouple its endonuclease and exonuclease functions, through targeted single or combinatorial mutagenesis combined with substrate-specific nuclease assays and stoichiometric binding analysis, could yield deeper mechanistic insights. Understanding this bifunctionality at the molecular level could have broader implications for manipulating Cas4-family enzymes in genome engineering and DNA repair contexts. In summary, adding to the information available for Cas4 proteins, the current study characterizes TON_0321, a Cas4 protein encoded next to the type IV-C CRISPR cassette. We conclude that TON_0321 is a structure-selective endonuclease with 5′ to 3′ ssDNA–specific exonuclease activity.
Experimental procedures
Protein expression, purification, and mutagenesis
The genes encoding TON_0321 and Cas2 (protein accession no.: TON_0320) were amplified from the genomic DNA of T. onnurineus NA1 using gene-specific primers (Table S1). The gene for TON_0321 was cloned into the pGEM-T cloning vector (Promega Corporation), then subcloned into the pET28a vector (Novagen) with an N-terminal 6X His tag, and transformed into BL21-competent cells (Thermo Fisher Scientific). The gene for Cas2 was cloned into the pGEX-4T vector. For the active site mutant of TON_0321, residues D96 and E105 were mutated to alanine by Quick-change site-directed mutagenesis using KOD hot start DNA polymerase (Merck Millipore). For TON_0321 expression, transformed bacteria were grown at 37 °C in LB media containing 50 μg ml^−1^ kanamycin until the absorbance at 600 nm reached ∼0.6 and were then induced with 1 mM IPTG at 37 °C for 4 h. Bacteria were pelleted at 4000 rpm for 10 min at 4 °C. The expressed protein was purified using nickel–nitrilotriacetic acid (Ni–NTA) affinity chromatography. The bacterial pellet from a 2-l culture was resuspended in lysis buffer containing 25 mM Tris–HCl (pH 8.0), 10% glycerol, 500 mM NaCl, and 5 mM imidazole supplemented with PMSF and lysozyme. The suspension was incubated at 37 °C for 30 min, followed by sonication. The cell lysate was centrifuged at 10,000 rpm for 20 min, and the collected supernatant was loaded onto a pre-equilibrated Ni–NTA column (Bio-Rad). After binding for 2 h (binding buffer: 25 mM Tris–HCl [pH 8.0], 10% glycerol, 500 mM NaCl, and 25 mM imidazole), the Ni–NTA beads were washed with wash buffer containing 25 mM Tris–HCl (pH 8.0), 10% glycerol, 500 mM NaCl, and 50 mM imidazole, and the protein was then eluted in 25 mM Tris–HCl (pH 8.0), 500 mM NaCl, 10% glycerol, and 300 mM imidazole. Fractions obtained were run on a 12% SDS-PAGE gel. The eluted fraction was then loaded on a HiLoad 16/60 S200 column (GE Healthcare) pre-equilibrated with 25 mM Tris–HCl (pH 8.0), 10% glycerol, and 500 mM NaCl. The presence of protein in the peak obtained was confirmed by running the fractions on a 12% SDS-PAGE gel. The protein fractions were stored in 50% glycerol at −20 °C. The active site mutant protein TON_0321^D96A/E105A^ was purified similarly.
For Cas2 expression, transformed bacteria were grown at 37 °C in LB media containing 100 μg ml^−1^ ampicillin until the absorbance at 600 nm reached ∼0.6. Protein expression was induced with 0.4 mM IPTG, followed by overnight incubation at 16 °C. The culture was then centrifuged at 6000 rpm for 20 min at 4 °C to harvest the cells. For Cas2 purification, the cell pellets from a 2-l culture were resuspended in lysis buffer (25 mM Tris–HCl [pH 8.0], 150 mM NaCl, and 10% glycerol) containing lysozyme and PMSF and incubated on ice for 30 min. The suspension was then sonicated and centrifuged at 18,000 rpm for 30 min at 4 °C. The supernatant was loaded onto a GSTrap 4B column (Cytiva) pre-equilibrated with wash buffer (25 mM Tris–HCl [pH 8.0], 150 mM NaCl, and 10% glycerol). Elution was performed using elution buffer (25 mM Tris–HCl [pH 8.0], 150 mM NaCl, 10% glycerol, and 10 mM glutathione) on an ÄKTA FPLC system. Fractions containing the protein of interest were concentrated using a 10 kDa Amicon concentrator (MERCK). For further purification, size-exclusion chromatography was performed by injecting the concentrated protein into a HiLoad 16/60 S200 column (GE Healthcare) equilibrated with buffer (25 mM Tris–HCl [pH 8.0], 150 mM NaCl, and 5% glycerol). The chromatogram peaks were analyzed on a 12% SDS-PAGE gel, and the fractions containing the pure protein of interest were concentrated and stored at −20 °C in 50% glycerol.
Nuclease assay
The HPLC-purified ssDNA oligonucleotides were purchased from SIGMA (Table S2A). The DNA substrates were prepared by annealing these ssDNA oligonucleotides in the combinations listed in Table S2B. For annealing, the ssDNA oligonucleotides were mixed in an equimolar ratio and heated at 95 °C for 5 min, followed by slow cooling overnight. The substrate for the nuclease reaction comprised a mixture of 100 nM unlabeled substrate and 25 nM labeled substrate. The labeled substrate carried 6-FAM on the 5′-end of Y0-1 and Cy5 on the 3′-end of Y0-4. The reaction mixture without protein, consisting of 50 mM Tris–HCl (pH 8.0), 2.5 mM MgCl_2_, 1.0 mM DTT, 125 mM NaCl, and 0.1 mg/ml bovine serum albumin (BSA), was incubated at 35 °C for 15 min. To initiate the reaction, 25 nM of protein was added. Aliquots were taken out at different time points (0, 2.5, 5, 10, 20, and 40 min), and the reaction was quenched by treating the aliquots with 5 mM EDTA, 2 mg/ml proteinase K, and 0.2% SDS. The samples for urea PAGE were prepared by heating the reaction mix at 95 °C for 10 min in formamide dye. The samples were resolved on 18% TBE–urea PAGE and scanned using a Typhoon scanner (GE Healthcare). The quantification of bands (substrate and products) was performed using Amersham ImageQuant software, involving lane creation, background subtraction, band detection, and relative intensity quantification. Different DNA oligonucleotides were used to generate a DNA ladder (Table S2C). All experiments were done in triplicate.
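The relative intensity quantification described above can be illustrated with a short Python sketch. It does not reproduce the ImageQuant workflow used in the study: the function name `fraction_cleaved` and all band intensities are hypothetical, and the sketch assumes that the cleaved fraction is the summed product signal divided by the total background-subtracted signal in a lane.

```python
def fraction_cleaved(substrate, products):
    """Return product signal as a fraction of the total lane signal.

    Inputs are background-subtracted band intensities (arbitrary units).
    """
    product_signal = sum(products)
    total = substrate + product_signal
    if total <= 0:
        raise ValueError("lane has no signal after background subtraction")
    return product_signal / total


# Hypothetical intensities for the assay time points (min):
# time -> (substrate band, [product bands]). Not data from the study.
lanes = {
    0: (1000.0, []),
    2.5: (900.0, [100.0]),
    5: (750.0, [250.0]),
    10: (500.0, [450.0, 50.0]),
    20: (250.0, [650.0, 100.0]),
    40: (100.0, [700.0, 200.0]),
}

time_course = {t: fraction_cleaved(s, p) for t, (s, p) in lanes.items()}

for t, frac in time_course.items():
    print(f"{t:>4} min: {frac:.2f} of the labeled strand cleaved")
```

Normalizing to the total lane signal, rather than to the zero time point, makes each lane self-referencing and so tolerates small differences in loading between lanes.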
Cruciform assay
The cruciform assay was performed using plasmid pIRbke8^mut^ (57, 58). The cruciform substrate (pIRbke8^mut^) is a supercoiled DNA plasmid without free ends that contains secondary structure elements. Plasmid pIRbke8^mut^ (substrate 1) includes an inverted repeat sequence, which, upon incubation at 37 °C, forms a cruciform-like structure resembling a four-way junction. This cruciform-like structure was eliminated through site-directed mutagenesis (primers are listed in Table S1), resulting in a double-stranded, supercoiled plasmid without free ends or secondary structure (substrate 2). The plasmids were transformed into DH5α-competent cells, and positive transformants were selected on LB plates containing ampicillin. A primary culture was set up and harvested when the absorbance at 600 nm reached 0.5 (midlog phase). The plasmid was isolated from the culture pellet using a Qiagen midiprep plasmid isolation kit. For each experiment, both plasmids were diluted to 10 nM with water. The reaction was performed with 2 nM of plasmid DNA in a buffer containing 50 mM Tris–HCl (pH 8.0), 2.5 mM MgCl_2_, 1.0 mM DTT, 125 mM NaCl, and 0.1 mg/ml BSA, which was incubated at 37 °C for 30 min to induce cruciform extrusion. To initiate the reaction, 25 nM of protein was added, and the reaction was carried out for 40 min. Aliquots were taken out at different time points (0, 2.5, 5, 10, 20, and 40 min). EcoRI, T7 endonuclease I, and Nt.BspQ1 enzymes from New England Biolabs were used as positive controls, and these reactions were run for 40 min. The reactions were quenched using 5 mM EDTA, 2 mg/ml proteinase K, and 0.2% SDS. Products were run on a 0.8% agarose gel. The gel was stained with ethidium bromide (0.5 μg/ml in 1X TBE), destained using water, and visualized using a Gel Doc XR+ System (Bio-Rad). The quantification of bands (substrate and products) was performed using Image Lab (Bio-Rad) software. All experiments were done in triplicate.

Figure 3B was created in BioRender. Gaur, V. (2025) https://BioRender.com/sj9qxon. Figure S15A was created in BioRender. Gaur, V. (2025) https://BioRender.com/rj3p8sh.
Fluorescence anisotropy
To study the binding of TON_0321 protein with various DNA substrates (Table S3), fluorescence anisotropy was used. DNA substrates were labeled with 6-FAM on the 5′-end of Y0-1. A DNA substrate mixture (35 nM) consisting of 10 nM unlabeled DNA and 25 nM labeled DNA was used. Protein concentration was varied in the range of 0 to 1600 nM. Because metal ions may also contribute to DNA binding, catalytic activity was suppressed in the DNA substrate–binding studies by using Zn^2+^ ions rather than alanine mutants. The reaction was set up in a buffer containing 20 mM Tris–HCl (pH 8.0), 100 mM NaCl, 0.5 mM DTT, 0.1 mg/ml BSA, and 20 mM ZnSO_4_ at 25 °C in 96-well flat-bottom black polystyrene plates (BRAND). Anisotropy was measured using a POLARstar Omega plate reader at an excitation wavelength of 485 nm and an emission wavelength of 520 nm. Fluorescence anisotropy was calculated as (I_∥_ – I_⊥_)/(I_∥_ + 2I_⊥_), where I_∥_ and I_⊥_ are the intensities in the parallel and perpendicular directions, respectively. Binding was studied as the change in anisotropy (A–A_0_) versus protein concentration, where A is the observed anisotropy and A_0_ is the anisotropy of the substrate alone. All experiments were done in triplicate.
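As an illustration of the calculation above, the following Python sketch computes anisotropy from parallel and perpendicular intensities and then the change in anisotropy relative to the DNA substrate alone. The function names and every intensity value are hypothetical; the sketch makes no assumption about instrument-specific corrections (such as a G-factor), which the text does not describe.

```python
def anisotropy(i_parallel, i_perpendicular):
    """Anisotropy = (I_par - I_perp) / (I_par + 2 * I_perp)."""
    total = i_parallel + 2.0 * i_perpendicular
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return (i_parallel - i_perpendicular) / total


def anisotropy_change(readings, dna_alone):
    """Return A - A0 for each protein concentration.

    readings: {protein concentration in nM: (I_par, I_perp)}
    dna_alone: (I_par, I_perp) for the DNA substrate without protein
    """
    a0 = anisotropy(*dna_alone)
    return {conc: anisotropy(i_par, i_perp) - a0
            for conc, (i_par, i_perp) in readings.items()}


# Hypothetical plate-reader intensities (arbitrary units), not study data.
dna_alone = (1100.0, 1000.0)
readings = {
    0: (1100.0, 1000.0),
    100: (1200.0, 1000.0),
    400: (1350.0, 1000.0),
    1600: (1500.0, 1000.0),
}

changes = anisotropy_change(readings, dna_alone)
for conc, delta in changes.items():
    print(f"{conc:>5} nM protein: A - A0 = {delta:.4f}")
```

Subtracting A_0_ places all substrates on a common baseline of zero, so curves for substrates with different intrinsic anisotropy can be compared on one plot.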
Oligomeric state determination
To determine the oligomeric state of TON_0321, size-exclusion chromatography was performed on a HiLoad 16/60 S200 column (GE Healthcare) pre-equilibrated with 25 mM Tris–HCl (pH 8.0), 10% glycerol, and 500 mM NaCl. The column was calibrated using gel filtration markers (Bio-Rad; catalog no.: 151-1901). A standard curve was generated for the gel filtration markers (thyroglobulin, vitamin B12, myoglobin, ovalbumin, and gamma globulin). K_av_ was calculated as (V_e_ – V_o_)/(V_t_ – V_o_), where V_e_, V_o_, and V_t_ are the elution volume, void volume, and total volume, respectively. The column's void volume was determined using thyroglobulin (670 kDa).
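The K_av_ calculation and the use of a standard curve can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: all elution volumes, the void volume, and the total volume are hypothetical; the marker masses are the nominal values commonly quoted for this type of gel filtration standard; and the sketch assumes a linear relationship between K_av_ and log10 of molecular mass, fitted to the four markers other than thyroglobulin.

```python
import math


def kav(ve, vo, vt):
    """K_av = (Ve - Vo) / (Vt - Vo)."""
    return (ve - vo) / (vt - vo)


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical column parameters (ml); assumed values, not from the study.
VOID_VOLUME_ML = 45.0    # elution volume of thyroglobulin (670 kDa)
TOTAL_VOLUME_ML = 120.0  # assumed total column volume

# Marker -> (nominal mass in kDa, hypothetical elution volume in ml).
standards = {
    "gamma globulin": (158.0, 60.0),
    "ovalbumin": (44.0, 75.0),
    "myoglobin": (17.0, 87.0),
    "vitamin B12": (1.35, 110.0),
}

log_masses = [math.log10(mass) for mass, _ in standards.values()]
kav_values = [kav(ve, VOID_VOLUME_ML, TOTAL_VOLUME_ML)
              for _, ve in standards.values()]
slope, intercept = fit_line(log_masses, kav_values)


def estimate_mass_kda(ve):
    """Estimate apparent mass (kDa) from an elution volume (ml)."""
    k = kav(ve, VOID_VOLUME_ML, TOTAL_VOLUME_ML)
    return 10 ** ((k - intercept) / slope)


# Hypothetical elution volume for a sample peak.
print(f"apparent mass at 80 ml: {estimate_mass_kda(80.0):.1f} kDa")
```

Comparing the apparent mass from such a curve with the mass calculated from the sequence is one common way to judge whether a protein elutes as a monomer or a higher-order oligomer.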
In vitro double pulldown assay
TON_0321 and Cas2 (75 μg each) were mixed and incubated for 2 h at 4 °C with 100 μl of glutathione Sepharose 4B beads that had been washed and preincubated with 25 mM Tris–HCl (pH 8.0), 150 mM NaCl, and 5% glycerol. After incubation, the protein–bead mixture was centrifuged using a Corning Costar Spin-X centrifuge tube filter at 6000 rpm for 2 min at 4 °C, and the flow-through was collected. The beads were then washed 10 times with wash buffer (25 mM Tris–HCl [pH 8.0], 150 mM NaCl, and 5% glycerol). Proteins were eluted from the beads using an elution buffer (25 mM Tris–HCl [pH 8.0], 150 mM NaCl, 5% glycerol, and 10 mM glutathione). The eluted protein was incubated for 2 h at 4 °C with gentle shaking with 100 μl of Ni–NTA agarose beads (QIAGEN) that had been washed and preincubated with 25 mM Tris–HCl (pH 8.0), 150 mM NaCl, and 5% glycerol. Following incubation, the protein–bead mixture was centrifuged again using a Corning Costar Spin-X centrifuge tube filter at 6000 rpm for 2 min at 4 °C, and the flow-through was collected. The beads were washed 10 times with wash buffer (25 mM Tris–HCl [pH 8.0], 150 mM NaCl, and 5% glycerol). Finally, proteins were eluted by adding elution buffer (25 mM Tris–HCl [pH 8.0], 150 mM NaCl, 5% glycerol, and 500 mM imidazole). The flow-through, wash, and elution fractions were analyzed on Bio-Rad 4% to 20% Mini-PROTEAN TGX Precast Protein gels stained with Coomassie Brilliant Blue.
Data availability
All the data described are present in the article or the supporting information.
Supporting information
This article contains supporting information (39, 73, 74, 75).
Conflict of interest
The authors declare that they have no conflicts of interest with the contents of this article.
References
- 1. Bigini F., Lee S.H., Sun Y.J., Sun Y., Mahajan V.B. Unleashing the potential of CRISPR multiplexing: harnessing Cas12 and Cas13 for precise gene modulation in eye diseases. Vis. Res. 213 (2023) 108317. doi:10.1016/j.visres.2023.108317; PMCID: PMC10685911; PMID: 37722240
- 2. Li L., Liu W., Zhang H., Cai Q., Wen D., Du J. A new method for programmable RNA editing using CRISPR effector Cas13X.1. Tohoku J. Exp. Med. 260 (2023) 51–61. doi:10.1620/tjem.2023.J011; PMID: 36823185
- 3. Sen D., Mukhopadhyay P. Antimicrobial resistance (AMR) management using CRISPR-cas based genome editing. Gene Genome Editing 7 (2024) 100031
- 4. Zhang X., Wang X., Lv J., Huang H., Wang J., Zhuo M. Engineered circular guide RNAs boost CRISPR/Cas12a- and CRISPR/Cas13d-based DNA and RNA editing. Genome Biol. 24 (2023) 118. doi:10.1186/s13059-023-02992-z; PMCID: PMC10288759; PMID: 37353840
- 5. Yang Y., Wang D., Lü P., Ma S., Chen K. Research progress on nucleic acid detection and genome editing of CRISPR/Cas12 system. Mol. Biol. Rep. 50 (2023) 3723–3738. doi:10.1007/s11033-023-08240-8; PMCID: PMC9843688; PMID: 36648696
- 6. Ma E., Chen K., Shi H., Stahl E.C., Adler B., Trinidad M. Improved genome editing by an engineered CRISPR-Cas12a. Nucleic Acids Res. 50 (2022) 12689–12701. doi:10.1093/nar/gkac1192; PMCID: PMC9825149; PMID: 36537251
- 7. Li Z.-H., Wang J., Xu J.-P., Wang J., Yang X. Recent advances in CRISPR-based genome editing technology and its applications in cardiovascular research. Mil. Med. Res. 10 (2023) 12. doi:10.1186/s40779-023-00447-x; PMCID: PMC9999643; PMID: 36895064
- 8. Wang S.-W., Gao C., Zheng Y.-M., Yi L., Lu J.-C., Huang X.-Y. Current applications and future perspective of CRISPR/Cas9 gene editing in cancer. Mol. Cancer 21 (2022) 57. doi:10.1186/s12943-022-01518-8; PMCID: PMC8862238; PMID: 35189910
