LysR-type regulator LrhA promotes CRISPR-Cas immunity in Escherichia coli
Mengdie Fang, Jim Yap, Mingyue Fei, Mengxin Gong, Na Li, Yalan Lu, Mingjing Yu, Yuanyou Xu, Fabai Wu, Haichun Gao, Dongchang Sun

TL;DR
A new regulator called LrhA helps control CRISPR-Cas immunity in E. coli by adjusting gene activity to fight viruses and plasmids.
Contribution
LrhA is identified as a novel CRISPR-Cas activator that modulates adaptive immunity through differential regulation of cas gene transcription.
Findings
LrhA enhances CRISPR-Cas immunity by promoting cas gene transcription in high-expression strains.
Moderate LrhA activity accelerates plasmid clearance through enhanced spacer acquisition.
Optimal adaptive immunity is achieved with intermediate cas gene transcription via positive feedback.
Abstract
The CRISPR-Cas defense system safeguards prokaryotes against foreign genetic elements. Its activity is determined by the combined effects of adaptation and interference. However, the dynamic regulation of these two processes remains not fully understood. In this study, we identify the LysR-type transcriptional regulator LrhA, which is differentially expressed in various Escherichia coli strains, as a novel CRISPR-Cas activator that plays a critical role in modulating host defense levels. In a representative strain expressing a high level of LrhA, the regulator enhances CRISPR-Cas-mediated adaptive immunity against bacteriophage infection by promoting cas gene transcription through direct interaction with the promoter of the cas operon. Moderate activation of cas genes by weakly expressed LrhA in another representative strain efficiently accelerates the clearance of horizontally…
Funding
- National Natural Science Foundation of China (10.13039/501100001809)
- Key Research and Development Program of Zhejiang Province (10.13039/100022963)
- Natural Science Foundation of China (10.13039/501100001809)
- Zhejiang Lingyan Research and Development Program
- Joint Funds of the Zhejiang Provincial Natural Science Foundation of China
Taxonomy
Topics: CRISPR and Genetic Engineering · Bacterial Genetics and Biotechnology · Vibrio bacteria research studies
Introduction
Prokaryotes have evolved diverse defense systems against plasmids and viruses [1]. Among them, the CRISPR-Cas adaptive immune system provides bacteria and archaea with acquired and inheritable immunity against previously encountered invading nucleic acids [2]. The CRISPR-Cas system acts in three distinct stages. During the first contact with a foreign genetic element, short DNA fragments from this element are incorporated into the CRISPR array, which contains the genetic memory of previously infecting elements. This process is termed adaptation [3]. Under certain conditions, Cas proteins are expressed, accompanied by the transcription of the CRISPR array, and subsequent processing of the premature CRISPR RNA (pre-crRNA) into mature crRNAs [4]. In the final stage termed interference, crRNAs guide Cas proteins to the target sequence, where the Cas effectors are recruited for cleaving the previously encountered foreign DNA [5]. Native CRISPR-Cas systems have been engineered for broad applications, such as genome editing tools [6, 7] and antimicrobials for combating multi-drug resistant bacteria [8–10].
CRISPR-Cas adaptive immunity represents a powerful weapon of defense in prokaryotes. On the other hand, invasive nucleic acids often rapidly evolve resistance to CRISPR immunity by mutating the CRISPR-targeted region [11]. To fight back, a process named “primed adaptation” is provoked to rapidly incorporate new spacers, derived from DNA fragments produced by interference, for upgrading the CRISPR-Cas adaptive immunity [12–14]. Primed adaptation requires Cas proteins for assembly of the interference and the adaptation complexes [12–14]. Nevertheless, a chronically high level of CRISPR-Cas adaptive immunity increases the burden of producing Cas proteins and crRNAs, as well as the risk of suicide from autoimmunity [15, 16]. Although the working mechanisms of CRISPR adaptation and interference have been intensively investigated [3, 12, 14, 17], little is known about how CRISPR-Cas adaptive immunity is adjusted by host factors when facing different types of vulnerabilities [18, 19].
The type I CRISPR-Cas system, composed of the multi-subunit RNA-guided surveillance complex Cascade, the Cas3 nuclease, and the Cas1–Cas2 integrase, is the most prevalent type in prokaryotes [20]. Regulation studies have focused on type I CRISPR-Cas systems [18, 21–28], especially the type I-E CRISPR-Cas system of Escherichia coli, in which CRISPR interference is regulated by the LysR-type transcriptional regulator (LTTR) LeuO [29, 30], the histone-like proteins H-NS and StpA [31–33], and the carbon catabolite regulator CRP [34, 35]. Nevertheless, regulation of CRISPR adaptation has rarely been investigated in E. coli. We hypothesized the presence of host factor(s) that are involved in the coordination of CRISPR adaptation and interference, and that effectively help the bacteria counter foreign DNA under certain scenarios.
To search for host factor(s) that control CRISPR-Cas adaptive immunity in E. coli, we screened for new regulators of cas gene transcription by combining DNA pull-down and mass spectrometry, and identified the LTTR protein LrhA as an important activator that directly bound to the promoter of the cas operon and stimulated CRISPR-Cas adaptive immunity against bacteriophage infection. By promoting a positive feedback circuit between interference and adaptation, LrhA remarkably augmented the clearance of a CRISPR-targeted plasmid by moderately stimulating the transcription of cas genes required for both adaptation and interference. Our work uncovers how a host-encoded transcription factor can directly activate CRISPR-Cas functionality, providing new insights into the regulation of bacterial defence against horizontally transferred plasmids.
Materials and methods
Bacterial strains, plasmids, oligonucleotides, growth conditions, and media
Bacterial strains, plasmids, and oligonucleotides used in this study are listed in Supplementary Tables S5–S7. Escherichia coli gene-deletion mutants were constructed by using a λ-RED recombination system expressed by the temperature-sensitive plasmid pKD46 [36]. Transformants were selected using plates with appropriate antibiotics. The resistance gene was eliminated by introducing the temperature-sensitive plasmid pCP20 into the mutant strain. Corresponding mutants were examined through polymerase chain reaction (PCR) with the primer pairs listed in Supplementary Table S7 (Genotyping). LrhA has been reported to inhibit motility through repressing the flagellar flhDC operon [37]. Accordingly, the ΔlrhA mutants were further examined for motility-related phenotypes. Recombinant plasmids expressing LrhA and 10 other transcriptional factors (TFs) were constructed with a Transfer PCR method [38]. Plasmids were isolated and purified with a plasmid isolation kit according to the manufacturer’s protocol (Axygen Biotech Co., Ltd.). Escherichia coli cultures were grown at 37°C in LB broth medium containing 1% (wt/vol) tryptone, 0.5% (wt/vol) yeast extract, and 1% (wt/vol) NaCl, or on LB agar plates containing 1.5% (wt/vol) agar. When required, media were supplemented with ampicillin (100 μg ml^−1^), kanamycin (50 μg ml^−1^), or chloramphenicol (25 μg ml^−1^). Arabinose was added to induce gene expression in liquid cultures where indicated. Cell growth was measured in a spectrophotometer (MD SpectraMax iD5 or Biotek Synergy 2). All experiments were repeated independently at least three times unless otherwise indicated.
Pull-down assay
To prepare samples for mass spectrometric analysis of proteins binding to Pcas, 8 pmol of biotinylated Pcas DNA fragment was immobilized on 100 μl streptavidin beads (Invitrogen Dynabeads™ M-270, 65305) in binding buffer A [10 mM Tris–HCl, pH 7.5, 1 mM ethylenediaminetetraacetic acid (EDTA), 2 M NaCl] for 1 h at room temperature. Biocytin, a derivative of biotin, was used as the negative control. From a 5 ml culture of an E. coli hns-deletion mutant grown overnight at 37°C, proteins were extracted by sonication in binding buffer B (10 mM Tris–HCl, pH 7.5, 1 mM EDTA, 50 mM NaCl, 0.4 mM dithiothreitol and 10% glycerol), supplemented with protease inhibitor cocktail (Roche Co., Ltd.). The crude lysates were centrifuged at 15 000 × g for 10 min. The supernatant was collected, pre-cleared with empty streptavidin beads, and then incubated with 50 μl Pcas DNA-coated beads overnight at 4°C under constant rotation (20 rpm). Beads were washed five times with binding buffer B, then three times with binding buffer B supplemented with 1 mg salmon sperm DNA, and finally five times with binding buffer B to remove non-specific proteins. DNA-binding proteins were eluted in sodium dodecyl sulfate (SDS) sample loading buffer and separated by 10% SDS–polyacrylamide gel electrophoresis (PAGE), followed by mass-spectrometric analysis of the gel bands as described below.
Mass spectrometry analysis
Mass spectrometry analysis was performed by the Micrometer Biotech Company (Hangzhou, China). After overnight in-gel digestion of the Pcas-binding proteins with trypsin at 37°C, peptides were loaded onto a home-made reversed-phase analytical column (15 cm length, 75 μm i.d.) and analyzed with a gradient from 5% to 34% solvent B (0.1% formic acid in 80% acetonitrile) over 40 min, 34%–38% solvent B for 5 min, 38%–90% solvent B for 10 min, and then holding at 90% solvent B for 10 min, at a constant flow rate of 300 nl/min on an EASY-nLC 1000 UPLC system. Peptides were subjected to a nanospray ionization (NSI) source followed by tandem mass spectrometry (MS/MS) in a Thermo Scientific Orbitrap Fusion coupled online to the UPLC. Survey full-scan MS spectra (m/z 350–1800) were acquired in the Orbitrap with a resolution of 70 000 at m/z 400. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the iProX partner repository with the dataset identifier PXD036879.
Western blot
DNA-binding proteins and total cell extracts were mixed with SDS sample loading buffer and boiled for 5 min at 95°C; 20 μl of each sample was then loaded onto a 12% Bio-Rad stain-free SDS–PAGE gel (Bio-Rad, 1610185) and blotted onto PVDF membranes using a semi-dry transfer apparatus (Genscript). Protein on the membrane was visualized with the stain-free blot module of a ChemDoc™ XRS^+^ system. The membrane was blocked and incubated overnight at 4°C with primary antibody (mouse anti-Flag Ab, Cell Signaling Technology, Inc, 8146; mouse anti-His Ab, Proteintech Group, Inc, 66005), followed by horseradish peroxidase-conjugated secondary antibody (Cell Signaling Technology, 7076) for 1 h at room temperature. After washing, the immune complexes were detected with ECL Western Blotting Substrate (Pierce, 32209). The image was visualized in the ChemDoc™ XRS^+^ system with Image Lab^TM^ Software (Bio-Rad).
Measurement of promoter activity
To quantify the activity of P_cas_ or P_lrhA_ with the reporter assay, cells carrying the reporter plasmid pGLO-Pcas-gfp or pGLO-PlrhA-gfp were grown in M9 minimal medium containing 61 mM K_2_HPO_4_·3H_2_O, 38.21 mM KH_2_PO_4_, 2.5 mM MgSO_4_, 15.14 mM (NH_4_)_2_SO_4_, and 0.1% (wt/vol) tryptone, supplemented with 1% (wt/vol) glucose as the carbon source. pGLO-Pcas-gfp was used to monitor transcription of the cas operon by measuring the intensity of GFP fluorescence (excitation/emission wavelengths of 395/509 nm) and cell optical density (OD_600_) on a microplate reader (MD, SpectraMax iD5), as previously described [33].
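The reporter readout is a fluorescence signal normalized to cell density, expressed relative to a reference strain set to 1 arbitrary unit (the figure legends use the empty-vector strain as reference). A minimal sketch of that normalization follows; the function name and the readings are hypothetical, and blank subtraction is omitted because the text does not describe it.

```python
def relative_promoter_activity(gfp, od600, gfp_ref, od600_ref):
    """Return GFP/OD600 of a sample relative to a reference strain (reference = 1 AU)."""
    if od600 <= 0 or od600_ref <= 0:
        raise ValueError("OD600 readings must be positive")
    sample = gfp / od600            # density-normalized fluorescence of the sample
    reference = gfp_ref / od600_ref  # same quantity for the reference strain
    return sample / reference


# Hypothetical plate-reader readings, not data from the study.
activity = relative_promoter_activity(gfp=2400.0, od600=0.6,
                                      gfp_ref=1000.0, od600_ref=0.5)
print(f"relative promoter activity: {activity:.2f} AU")
```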
Quantification of gene transcription
For the quantitative real-time PCR (qPCR) assay, bacteria were harvested from 0.4 to 1 ml culture by centrifugation. Total RNA was extracted by using an RNA Extraction Kit (AG21017, Accurate Biotechnology Co., Ltd). Removal of the genomic DNA and reverse transcription were performed by using Evo M-MLV RT Mix Kit (with genome DNA clean reaction mix) (AG11728) according to the manufacturer’s instructions. SYBR Green Premix Pro Taq HS qPCR Kit (AG11701) was used for qPCR, and the cycle thresholds were determined using a BioRad CFX96 Real-Time System. rpoD was used as the internal control. The primers for cse1 and rpoD are listed in Supplementary Table S7.
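The text does not state how relative transcript levels were derived from the cycle thresholds. One common choice with a single internal control such as rpoD is the 2^−ΔΔCt method; the sketch below illustrates it under that assumption, and additionally assumes near-100% amplification efficiency for both primer pairs. The function name and Ct values are hypothetical.

```python
def fold_change_ddct(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^-ddCt method (assumed here, not stated in the text).

    ct_target / ct_ref: Ct of the gene of interest (e.g. cse1) and of the
    internal control (e.g. rpoD) in the test sample.
    ct_target_ctrl / ct_ref_ctrl: the same two Ct values in the control sample.
    """
    delta_sample = ct_target - ct_ref              # dCt in the test sample
    delta_control = ct_target_ctrl - ct_ref_ctrl   # dCt in the control sample
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values, not data from the study.
fold = fold_change_ddct(ct_target=22.0, ct_ref=18.0,
                        ct_target_ctrl=24.0, ct_ref_ctrl=18.0)
print(f"fold change relative to control: {fold:.1f}")
```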
Protein–DNA model prediction
Two LeuO–DNA models were predicted using ChaiDiscovery [39] (https://lab.chaidiscovery.com/). Residues 1–7 of LeuO in LeuO-LBS I model were excluded (ipTM 0.51, pTM 0.61) due to low confidence, and the full-length LeuO protein was retained in LeuO-LBS II model (ipTM 0.51, pTM 0.60). LrhA-Pcas model was predicted using AlphaFold 3 [40] (https://golgi.sandbox.google.com/).
A structural model of the RNAP–Pcas DNA–LrhA complex was also generated using AlphaFold3. To ensure accurate placement of the LrhA tetramer on promoter DNA, the downstream DNA region in the input was replaced with a sequence corresponding to a canonical LBS II-like binding motif prior to prediction. The highest-confidence AlphaFold3 output (pTM 0.68, ipTM 0.63) was retained. A complete RNAP holoenzyme model was then predicted separately using AlphaFold3 (pTM 0.83, ipTM 0.80), aligned to the RNAP portion of the initial RNAP-Pcas DNA–LrhA model using PyMOL, and combined with the native −60 to +20 promoter DNA sequence to produce the final RNAP–Pcas DNA–LrhA complex used for MD simulations. A corresponding RNAP-Pcas DNA control model lacking LrhA was generated by removing the LrhA tetramer and trimming terminal DNA residues to match the reduced complex length.
The models are available in ModelArchive (www.modelarchive.org) with the accession codes and password listed in Supplementary Table S8.
MD simulations
Molecular dynamics (MD) simulations were conducted using GROMACS-v2024 [41]. The system was solvated in a rectangular water box of TIP3 water, which extended 1.0 nm beyond the complex in every dimension, and neutralized with 150 mM NaCl. Before MD production, energy minimization and equilibration in the NVT ensemble at a temperature of 310 K were executed sequentially, using mdp files from CHARMM-GUI, to equilibrate the simulation box. A series of MD simulations were conducted in the NPT ensemble at a temperature of 310 K and a pressure of 1 bar for each system. Temperature and pressure were coupled using the velocity-rescale method (time constant of 1 ps) and isotropic pressure coupling with the Parrinello-Rahman algorithm (time constant of 5 ps), respectively. For the LeuO-LBS I and LeuO-LBS II models, NPT productions were run for 100 ns. The LrhA-LBS I-LBS II model and the +40,+42 mutant each underwent an initial 100 ns NPT production, followed by three independent 500 ns parallel NPT productions to ensure robust sampling of structural dynamics. For the RNAP-Pcas DNA-LrhA and RNAP-Pcas DNA control systems, three independent 100 ns NPT productions were performed for each model using the same simulation pipeline.
MD analysis
The root-mean-square deviation (RMSD) was analyzed using GROMACS-v2024. Residue–residue and residue–base contact maps were generated to illustrate the interaction frequency of specific residue pairs between the interfaces of the LrhA tetramer. We define a contact to be formed if (i) the distance between the closest atoms of the residues (base) is shorter than 4 Å, and (ii) the residues (base) are positioned at the interface, with each residue in a pair originating from a different monomer. The contact scores represent the average contact frequencies calculated from three independent NPT production simulations, and the final plots were generated using the average scores. The residues (base) were identified using MDAnalysis [42], and the minimum distance between residue (base) pairs was calculated using PLUMED [43]. From this contact map, residue–base pairs with interaction frequencies above 90% were selected and considered as stably contacting pairs. Hydrogen bonds were then analyzed for these selected pairs using the gmx-hbond tool in GROMACS [44]. For each residue–base pair, a hydrogen bond count between 0.8 and 1.6 was interpreted as the stable formation of one hydrogen bond, while a count between 1.6 and 2.0 indicated the presence of two stable hydrogen bonds. Binding free energy calculations were performed using the gmx_MMPBSA tool [45] on the processed trajectory (500 frames with 1 ns per frame), using an interval of 4 frames (i.e. one frame analyzed every 4 ns).
For the RNAP-Pcas DNA–LrhA and RNAP-Pcas DNA control systems, MMPBSA calculations were performed over the 50–100 ns segment of each 100 ns production run. Because the trajectories were saved at 1 ns per frame, this window corresponded to 50 frames, and energies were evaluated at an interval of two frames. For the LrhA–DNA complexes, all analyses, including contact frequency, hydrogen bond detection, and binding free energy estimation, were performed on the final 200 ns of the simulation trajectories (301–500 ns) to ensure sampling from the stable production phase. All structural models and interaction visualizations, including contact maps, molecular interactions, and alignments, were generated and processed using PyMOL or UCSF Chimera [46]. Conservation analysis was performed using the Consurf [47] server with default parameters, based on the UniProt id P36771.
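The contact and hydrogen-bond criteria described above can be illustrated with a short sketch that works on per-frame minimum distances already extracted from the trajectories. This is not the authors' pipeline (which used MDAnalysis, PLUMED, and GROMACS tools); the function names and input values are hypothetical, and assigning a count of exactly 1.6, or any count above 2.0, to the two-bond class is an assumption, since the text leaves those cases open.

```python
CONTACT_CUTOFF_A = 4.0     # closest-atom distance cutoff from the text, in angstroms
STABLE_FREQUENCY = 0.90    # pairs above this average frequency count as stable


def contact_frequency(min_distances_a, cutoff_a=CONTACT_CUTOFF_A):
    """Fraction of frames in which the closest-atom distance is below the cutoff."""
    frames = list(min_distances_a)
    if not frames:
        raise ValueError("no frames supplied")
    return sum(d < cutoff_a for d in frames) / len(frames)


def mean_contact_frequency(replicas):
    """Average the per-replica contact frequencies (three replicas in the text)."""
    freqs = [contact_frequency(r) for r in replicas]
    return sum(freqs) / len(freqs)


def stable_hbonds(mean_hbond_count):
    """Map a mean hydrogen-bond count to 0, 1, or 2 stable hydrogen bonds."""
    if mean_hbond_count < 0.8:
        return 0
    if mean_hbond_count < 1.6:
        return 1
    return 2  # assumption: 1.6 and anything above it count as two bonds


# Hypothetical minimum distances (angstroms) for one residue-base pair.
replicas = [
    [3.0, 3.5, 5.0, 3.8],
    [3.0, 3.0, 3.0, 3.0],
    [3.9, 4.1, 2.0, 1.0],
]
freq = mean_contact_frequency(replicas)
is_stable = freq > STABLE_FREQUENCY
print(f"mean contact frequency: {freq:.3f}, stable: {is_stable}")
```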
Purification of LrhA protein
The lrhA gene was amplified and cloned into the pET28a vector to generate pET28a-His-lrhA, which encodes the LrhA protein fused to an N-terminal His-tag. The pET28a-His-lrhA plasmid was transformed into E. coli BL21 (DE3). The overnight culture was inoculated into 100 ml LB medium at a ratio of 1:100. When the culture was grown at 37°C to an OD_600_ of 0.4, expression of LrhA was induced with 0.5 mM IPTG (Beyotime, ST098) at 37°C for 2 h. Cells were centrifuged at 8000 rpm for 5 min at 4°C. The cell pellet was resuspended in phosphate buffered saline (PBS) containing protease inhibitor and sonicated, and the lysate was centrifuged at 12 000 × g for 5 min at 4°C. His-LrhA protein was purified by Ni-NTA Resin (Thermo Fisher, 88222) in a gravity-flow column, and the bound protein was eluted with PBS containing 250 mM imidazole. The eluted protein was analyzed by SDS–PAGE, and dialyzed at 4°C in PBS with a 10 kDa dialysis tube (Millipore, UFC801096). Protein concentrations were determined by the Bradford assay (Beyotime, P0006C).
Electrophoretic mobility shift assay
Pcas (5 ng) probes were mixed with purified LrhA protein of different concentrations in the presence of binding buffer (10 mM Tris–HCl, pH 7.5, 50 mM KCl, 1 mM dithiothreitol, 2.5% glycerol, 1 mM EDTA). Pcas probes and their truncated or mutated probes were generated by PCR with primers P126 (Pcas-F) and P127 (Pcas-R). A Ctrl probe generated by PCR from the lacZα gene with primers P69 and P70 was used as a control. The mixture was incubated at room temperature for 20 min, then separated by electrophoresis in 5% native polyacrylamide gels in 0.5 × Tris-borate-EDTA buffer. Bands were transferred to PVDF membrane using the semi-dry transfer apparatus (Genscript) and visualized by Chemiluminescent Nucleic Acid Detection Module (Thermo Fisher, 89880).
Plasmid loss assay
The E. coli strains harboring the CRISPR-targeted plasmid pT (pdsRED-CR1) or the CRISPR-non-targeted plasmid pNT (pdsRED) were grown overnight in LB culture supplemented with ampicillin (100 μg ml^−1^). The overnight-grown culture was washed and inoculated at a ratio of 2% into antibiotic-free LB medium, followed by incubation at 37°C with shaking. At intervals, the culture was serially diluted and spread on the LB plate supplemented with or without ampicillin (100 μg ml^−1^). The number of cells that maintained the plasmid was evaluated by comparing colonies on the LB plate supplemented with or without ampicillin. The relative CRISPR immunity was defined as the proportion of cells containing the pNT plasmid relative to those containing the pT plasmid.
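The two ratios defined in this assay can be written out as a small sketch. It assumes that colony counts from the plates with and without ampicillin have already been scaled to the same dilution; the function names and counts are hypothetical.

```python
def plasmid_retention(cfu_with_amp, cfu_without_amp):
    """Proportion of cells that still carry the plasmid (ampicillin-resistant / total)."""
    if cfu_without_amp <= 0:
        raise ValueError("total colony count must be positive")
    return cfu_with_amp / cfu_without_amp


def relative_crispr_immunity(retention_pnt, retention_pt):
    """Proportion of pNT-carrying cells relative to pT-carrying cells."""
    if retention_pt <= 0:
        raise ValueError("pT retention is zero, so the ratio is undefined")
    return retention_pnt / retention_pt


# Hypothetical colony counts, not data from the study.
retention_pnt = plasmid_retention(cfu_with_amp=180, cfu_without_amp=200)
retention_pt = plasmid_retention(cfu_with_amp=9, cfu_without_amp=200)
immunity = relative_crispr_immunity(retention_pnt, retention_pt)
print(f"relative CRISPR immunity: {immunity:.1f}")
```

A value near 1 would mean the targeted and non-targeted plasmids are lost at similar rates; larger values indicate preferential loss of pT.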
Spacer acquisition assay
For experiments presented in Fig. 6F–G, E. coli strains carrying pT or pNT were grown overnight in LB media supplemented with ampicillin (100 μg ml^−1^), then diluted 1:50 and grown for a further 36 h at 37°C with shaking.
For experiments presented in Supplementary Fig. S12, strains were grown in fresh LB with ampicillin (100 μg ml^−1^) (for pT plasmid maintenance) for 24 h and diluted 1:500 in fresh LB with ampicillin, and grown for an additional 24 h. Passaging was performed for 3 days.
To monitor spacer acquisition, 200 μl of the cultures were collected by precipitation, washed thrice, and then resuspended in distilled water. These cells were used as the template for PCR amplification of the CRISPR array I by using the primer pair P130 (Adaptation-F) and P131 (Adaptation-R) as shown in Fig. 6E. All PCR products were run at 150 volts on 2% agarose gels with GelStain (TransGen Biotech, GS101) in Tris-Acetate-EDTA buffer to identify parental and expanded arrays (parental array + 61 bp). For sequencing newly acquired spacers, the expanded array bands were excised, and the corresponding DNA fragments were purified using a PCR Clean-up kit (MN 740609.50) and ligated into T vectors (Takara, 6013).
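Because each expanded array is the parental product plus 61 bp, the number of newly acquired spacers can be estimated from an amplicon size. The sketch below assumes that every acquisition adds one 61 bp unit and allows a tolerance for gel-based size estimates; the parental amplicon size, the tolerance, and the function name are hypothetical, since the text does not give them.

```python
UNIT_BP = 61  # size added per acquisition event, from the text


def acquired_spacers(band_bp, parental_bp, tolerance_bp=15):
    """Estimate how many 61 bp units a band carries beyond the parental array.

    Returns None when the band size does not match any whole number of units
    within the tolerance, or when it is smaller than the parental product.
    """
    extra = band_bp - parental_bp
    n = round(extra / UNIT_BP)
    if n < 0 or abs(extra - n * UNIT_BP) > tolerance_bp:
        return None
    return n


PARENTAL_BP = 300  # hypothetical amplicon size, for illustration only
calls = {size: acquired_spacers(size, PARENTAL_BP) for size in (300, 361, 425, 330)}
print(calls)
```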
M13 phage sensitivity assay
Phage infection assay was conducted with a method described previously [48] with slight modifications as follows. Escherichia coli ER2738 strains with or without chromosomal g8 spacers [12, 17] were grown in LB medium supplemented with M13 phage at 37°C. Supernatants were collected at indicated time points to determine the time course of phage titer via double-agar overlays. Briefly, the supernatants were diluted tenfold, then mixed with 100 μl ER2738 (OD_600_ ∼ 0.5). The mixture was added into 1 ml 0.75% molten LB top agar and poured on LB agar (1.5%) plates containing 0.2 mM IPTG (isopropyl-β-D-thiogalactoside) and 40 μg ml^−1^ X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside) for visualization of plaques formed by M13 phage which carries the lacZα gene. After incubation at 37°C overnight, the plaque-forming units (PFU) on plates were counted.
Gene Ontology analysis
For Gene Ontology analysis, the list of proteins was uploaded to the Gene Ontology database search portal (http://geneontology.org/) and analyzed for molecular function to retrieve the plotted terms with their corresponding P values. The raw P value was determined by Fisher’s exact test (no correction), and the false discovery rate (FDR) was calculated by the Benjamini–Hochberg procedure. Only results with both FDR and P < .05 were displayed.
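The Benjamini–Hochberg adjustment and the display filter can be sketched in plain Python. The raw P values below are hypothetical stand-ins for per-term Fisher's exact test results; the portal computes both quantities itself, so this only illustrates the procedure.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (FDR) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values for five GO terms.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.6]
fdr = benjamini_hochberg(raw_p)

# Keep only terms that pass both thresholds, as in the text.
displayed = [i for i, (p, q) in enumerate(zip(raw_p, fdr)) if p < 0.05 and q < 0.05]
print(fdr, displayed)
```

In this example the third and fourth terms have raw P < .05 but an FDR just above .05, so they would not be displayed.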
Motility test of E. coli lrhA+ and lrhA− strains
Tryptone swarm plates (1% tryptone, 0.5% NaCl, and 0.3% Bacto agar) were spotted with 10 μl overnight culture of the indicated strains, and incubated for 4 h at 37°C, followed by overnight incubation at room temperature.
Conjugation interference assays
To assess type I-E CRISPR-Cas immunity by conjugation assay, a conjugative plasmid pTc, which carries a DNA fragment with four protospacer-adjacent motifs (PAM)-containing spacers from pT [33], was used for evaluating CRISPR immunity during plasmid transfer. CRISPR-targeted conjugation plasmid (pTc) or CRISPR-non-targeted conjugation plasmid (pNTc) were transferred from E. coli WM3064 donor cells to indicated recipient strains through conjugation. The frequency of plasmid conjugation was calculated as the ratio of transconjugants per total recipients. The relative CRISPR immunity was defined as the conjugation frequency of pNTc plasmid relative to that of pTc plasmid.
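The two quantities defined above can be written compactly as follows. The symbols $f$, $N_{\mathrm{transconjugants}}$, and $N_{\mathrm{recipients}}$ are notation introduced here for illustration, and the numbers in the worked line are hypothetical.

```latex
f = \frac{N_{\mathrm{transconjugants}}}{N_{\mathrm{recipients}}},
\qquad
\text{relative CRISPR immunity} = \frac{f_{\mathrm{pNTc}}}{f_{\mathrm{pTc}}}

% Hypothetical example:
% f_{\mathrm{pNTc}} = 10^{-2},\quad f_{\mathrm{pTc}} = 10^{-4}
% \;\Rightarrow\; \text{relative CRISPR immunity} = 10^{-2} / 10^{-4} = 100
```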
Bacterial adenylate cyclase-based two-hybrid assays
The bacterial adenylate cyclase-based two-hybrid (BACTH) assay was performed as described previously [49]. Briefly, each protein was fused with T18 or T25 and the recombinant plasmids were co-transformed into E. coli BTH101 cells. Transformants were grown in LB supplemented with IPTG (0.5 mM), ampicillin (100 μg/ml), and kanamycin (50 μg/ml) at 30°C with shaking for 36 h. The culture (2.5 μl) was dropped onto LB plates supplemented with kanamycin (50 μg/ml), ampicillin (100 μg/ml), IPTG (0.5 mM) and X-Gal (40 μg/ml) and incubated at 30°C for 12 h.
Gel filtration chromatography
Gel filtration chromatography was performed using a Superdex Hiload 200 10/600 GL column and an AKTA fast protein liquid chromatography system (GE Healthcare, Germany) in 20 mM Tris–HCl, pH 8.0, 300 mM NaCl at a flow rate of 1 ml/min. The calibration curve was obtained with IgG (158 kDa), albumin (66 kDa), and ovalbumin (44 kDa). The LBS II (59 bp) and LBS I (58 bp) DNA fragments were generated by annealing oligonucleotides at a concentration of 5 μM in 10 mM Tris–HCl, pH 8.0, 1 mM EDTA, pH 8.0, 100 mM NaCl. To generate DNA-bound complexes of His-LrhA and LBS I/II, 5.7 μM (∼1 mg) of the protein (monomer concentration) was incubated with 112 nM of LBS I or LBS II for 1 h at room temperature in a total volume of 5 ml. Then, the LrhA protein–DNA complexes were subjected to gel filtration chromatography.
Comparative genomics and phylogenetic reconstruction
Genomes of 48 E. coli strains sequenced by the Chinese Academy of Agricultural Sciences were downloaded from NCBI (Bioproject: PRJNA1145286). A species phylogeny was reconstructed following the classical E. coli multilocus sequence typing scheme. Seven housekeeping genes, adk, fumC, gyrB, icd, mdh, purA, and recA, were extracted from each genome using BLASTN against the MG1655 reference alleles. These loci were individually aligned using MAFFT v7 [50] with the L-INS-i algorithm and subsequently concatenated into a single supermatrix for phylogenetic inference. The maximum-likelihood tree was inferred with IQ-TREE v2 [51] using automatic model selection and 1000 ultrafast bootstrap replicates to assess branch support. Presence or absence of lrhA, leuO, and hns was assessed using DIAMOND BLASTP [52] (v2.0) against the MG1655 reference proteins, using ≥70% sequence identity and ≥90% alignment coverage as detection thresholds. CRISPR-Cas loci were identified and classified using CRISPRCasTyper [53] with default parameters. The final phylogenetic tree, together with presence/absence matrices for lrhA, leuO, hns, and CRISPR-Cas subtype assignments, was visualized and annotated using the tvBOT module of the ChiPlot web server [54].
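The presence/absence call described above reduces to two thresholds applied to each alignment hit. The sketch below applies them to already-parsed hits; it does not call DIAMOND itself, the `Hit` record and its field names are hypothetical, and computing coverage over the reference (query) protein is an assumption because the text does not say which sequence the coverage refers to.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    gene: str      # reference protein name, e.g. "lrhA"
    genome: str    # genome in which the hit was found
    pident: float  # percent sequence identity of the alignment
    qstart: int    # first aligned position on the reference protein (1-based)
    qend: int      # last aligned position on the reference protein
    qlen: int      # length of the reference protein


def passes_thresholds(hit, min_identity=70.0, min_coverage=90.0):
    """Apply the >=70% identity and >=90% coverage thresholds from the text."""
    coverage = 100.0 * (hit.qend - hit.qstart + 1) / hit.qlen
    return hit.pident >= min_identity and coverage >= min_coverage


def presence_matrix(hits, genes, genomes):
    """Build a genome -> gene -> bool table; a gene is present if any hit passes."""
    matrix = {genome: {gene: False for gene in genes} for genome in genomes}
    for hit in hits:
        if passes_thresholds(hit):
            matrix[hit.genome][hit.gene] = True
    return matrix


# Hypothetical hits, not results from the study.
hits = [
    Hit("lrhA", "strainA", 98.0, 1, 312, 312),  # full-length, high identity
    Hit("leuO", "strainA", 65.0, 1, 314, 314),  # identity below threshold
    Hit("hns", "strainB", 99.0, 1, 100, 137),   # coverage about 73%
]
matrix = presence_matrix(hits, genes=["lrhA", "leuO", "hns"],
                         genomes=["strainA", "strainB"])
print(matrix)
```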
DNA bending angle analysis
DNA bending angles were calculated using a three-block center-of-mass (COM) definition [55]. For LeuO-DNA and LrhA-DNA simulations, the upstream, middle, and downstream blocks were defined as described above. For the RNAP-Pcas DNA-LrhA and RNAP-Pcas DNA systems, bending angles were computed over the −60 to +20 promoter region. The upstream COM was calculated from phosphorus atoms in −60 to −40, the middle COM from the first re-annealed base pair immediately downstream of the melted bubble, and the downstream COM from +10 to +20. The bending angle was defined as the geometric angle formed by UP-MID-DOWN COM points, following a coarse-grained three-block scheme analogous to the approach of Sharma et al. [55].
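The three-block definition above amounts to computing three centers of mass and the angle between them. A minimal sketch follows, assuming the angle is measured at the middle COM, so that straight DNA gives 180°; whether the reported bending angle is this value or its deviation from 180° is not stated in the text. Coordinates and function names are hypothetical.

```python
import math


def center_of_mass(coords, masses=None):
    """Center of mass of a block of atoms; equal masses if none are given
    (appropriate when every atom is a phosphorus)."""
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    return tuple(sum(m * c[k] for m, c in zip(masses, coords)) / total
                 for k in range(3))


def three_block_angle(up, mid, down):
    """Angle in degrees at the MID point formed by the UP-MID-DOWN COM points."""
    v1 = [u - m for u, m in zip(up, mid)]
    v2 = [d - m for d, m in zip(down, mid)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))


# Hypothetical phosphorus coordinates (nm) for the three blocks.
up_com = center_of_mass([(-10.0, 0.0, 0.0), (-12.0, 0.0, 0.0)])
mid_com = center_of_mass([(0.0, 0.0, 0.0)])
down_com = center_of_mass([(0.0, 10.0, 0.0), (0.0, 12.0, 0.0)])
angle = three_block_angle(up_com, mid_com, down_com)
print(f"UP-MID-DOWN angle: {angle:.1f} degrees")
```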
Statistical analysis
The statistical details of experiments are provided in the figure legends. Statistics were performed using GraphPad Prism. To determine statistical significance, a two-tailed Student’s t-test was performed for pairwise comparisons, and one-way analysis of variance (ANOVA) with Dunnett’s multiple comparisons test was used for comparisons between multiple groups. Values are reported as mean ± SD.
Results
Screening for proteins binding to the promoter of cas genes
Expression of the cas operon is regulated by H-NS and LeuO, which bind to specific sites within Pcas [29, 31] (Fig. 1A and Supplementary Fig. S1A). To screen for new regulators that potentially affect the transcription of cas genes, we performed a DNA pull-down assay using a DNA probe containing Pcas as a bait (Fig. 1A). Because H-NS binds to high-affinity sites within Pcas and possibly spreads along the adjacent DNA [56], potentially masking the binding sites of other regulators, we screened for proteins binding to Pcas in hns-deletion mutants of two E. coli strains (MC4100 and BW25113). Mass spectrometry analysis showed that >100 proteins in each strain were isolated with the biotinylated Pcas (Supplementary Fig. S1B). Among them, 49 proteins were detected in both strains (Supplementary Fig. S1B). According to the bioinformatic analysis of proteins with high coverage of matched peptides, 18 proteins were classified as TFs (Fig. 1B, Supplementary Fig. S1C, and Supplementary Tables S1 and S2). Two previously documented regulators of the CRISPR-Cas system (i.e. StpA [32, 33] and SlyA [57]) were identified in the protein mixture bound to Pcas, indicating that the screening approach was effective. Remarkably, the LTTR LrhA was highly abundant in protein mixtures bound to Pcas in both MC4100 and BW25113 strains (Fig. 1B and C). Subsequent pull-down assay confirmed that purified LrhA bound Pcas, suggesting that LrhA could be a potential regulator of the CRISPR-Cas system in E. coli (Fig. 1D and Supplementary Fig. S1D).
Figure 1. Screen for new regulators modulating transcription of cas genes. (A) Schematic illustration of the CRISPR-Cas locus in E. coli. Repeats (diamonds) in the CRISPR array are interspersed with 12 unique spacer sequences (rectangles). The promoter of the cas operon (Pcas) is indicated by the black arrow. The Pcas DNA probe for pull-down assay contains a transcription start site (TSS) (red letter), LeuO binding site I and II (LBS I/II), and H-NS binding site (HBS). For the pull-down proteomic experiment, a biotinylated probe (biotin-Pcas DNA) was immobilized onto streptavidin beads, followed by incubation with extracts from E. coli MC4100 Δhns (ZJUTCBB0015) or BW25113 Δhns (ZJUTCBB0009) before SDS–PAGE electrophoresis. (B) A heatmap shows the number of peptides pulled down by Pcas DNA probe. (C) SDS–PAGE gel electrophoresis of nontargeted biocytin (negative control, NT) or biotin-Pcas DNA-binding proteins. NT or biotin-Pcas samples were digested in gel with trypsin, and corresponding peptides were analyzed by mass spectrometry. An arrow denotes the band corresponding to LrhA (34.593 kDa). An asterisk marks the higher-abundance band in the MC4100 Δhns sample (proteins listed in Supplementary Table S9). (D) Pull-down assay of physical interaction between Pcas and LrhA with a Flag-tag (Flag-LrhA). The biotinylated probe (biotin-Pcas DNA) was immobilized onto streptavidin beads and incubated with Flag-LrhA from cell extracts before (lane 3, input) and after (lane 2, Pcas) pull-down, followed by western blot assay with antibody against Flag-tag. Nontargeted biocytin (NT) was used as a negative control (lane 1). (E) (Upper) Effects of 10 predicted TFs and LeuO on the Pcas activity in E. coli BW25113 Δhns ΔleuO (EC73). (Lower) Effects of YdcI, SlyA, LrhA, and LeuO on the Pcas activity in E. coli BW25113 Δhns ΔleuO ΔlrhA (EC77). The expression of TFs was induced by adding 10 mM arabinose. The Pcas activity was evaluated by GFP/OD600 at 37°C, with the relative expression level of Δhns ΔleuO carrying empty vector pSU19 (Vector) as 1 arbitrary unit. All bars represent the mean, and error bars denote standard deviation. Individual biological replicates are shown (n = 4). Statistical significance was determined using one-way ANOVA with Dunnett’s multiple comparisons test. ****P < .0001; *P < .01. (F) Phylogenetic analysis of 9 HTH-type Pcas-binding TFs from 5–8 different bacteria species.
To check whether these Pcas-binding TFs regulate transcription of the cas operon, their effects on the activity of Pcas were evaluated with a previously described fluorescence reporter system [33] (Fig. 1E). To avoid the effects of the known strong regulators H-NS and LeuO, the 10 predicted TFs and LeuO (as the positive control) were separately overexpressed in a Δhns ΔleuO mutant. SlyA promoted the activity of Pcas by 2.4-fold (Fig. 1E), in agreement with a previous report [57]. Eight TFs, which are phylogenetically distantly related, were found to contain a helix-turn-helix (HTH) DNA-binding domain (DBD) similar to that of LeuO (Fig. 1F). The LTTR ArgP, known to act as either a transcriptional activator or a repressor depending on its binding to two effectors, lysine and arginine [58], reduced the activity of Pcas by 70%, while LrhA increased the activity of Pcas by more than two-fold, compared with the Vector (Fig. 1E). Although YdcI and AaeR also belong to the LTTRs and bind to Pcas, YdcI induced a 1.3-fold activation of Pcas in an lrhA-dependent manner, whereas AaeR had no effect (Fig. 1E). Thus, we narrowed the candidates down to the regulators LrhA, SlyA, ArgP, and HexR. To investigate their potential roles in regulating CRISPR interference, the conjugative plasmid pTc, which carries a DNA fragment with four PAM-containing spacers from pT [33], was transferred from the donor (WM3064) to recipients (BW25113 Δhns ΔleuO mutant strains expressing these regulators). The clearance of the CRISPR-targeted plasmid (pTc) and the CRISPR-non-targeted plasmid (pNTc) during conjugation was examined. As shown in Supplementary Fig. S1E and F, expression of LrhA or LeuO decreased the conjugation frequencies of pTc but not pNTc after 24 h of growth, indicating higher CRISPR immunity than in the control. In contrast, expression of ArgP did not significantly affect the clearance rate of either plasmid, leaving CRISPR immunity unaffected in the corresponding strain. 
Overexpression of HexR or SlyA reduced plasmid stability, so the conjugation frequencies of both plasmids could not be determined (Supplementary Fig. S1E). Therefore, we focused on LrhA in the following investigation.
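The fold changes quoted above are ratios of GFP/OD600 readings normalized to the empty-vector strain, which is defined as 1 arbitrary unit. A minimal sketch of that normalization is given here; the function name and all numerical readings are hypothetical and serve only to illustrate the arithmetic, not to reproduce the reported measurements.

```python
from statistics import mean, stdev


def relative_promoter_activity(gfp, od600, ref_gfp, ref_od600):
    """Return (mean, SD) of GFP/OD600 normalized to a reference strain.

    The mean GFP/OD600 of the reference (empty vector) is defined as
    1 arbitrary unit, following the normalization described in the text.
    """
    sample = [g / o for g, o in zip(gfp, od600)]
    reference = mean([g / o for g, o in zip(ref_gfp, ref_od600)])
    relative = [s / reference for s in sample]
    return mean(relative), stdev(relative)


# Hypothetical plate-reader values (n = 4), for illustration only.
vector_gfp, vector_od = [1000, 1100, 950, 1050], [0.50, 0.52, 0.49, 0.51]
tf_gfp, tf_od = [2300, 2500, 2200, 2400], [0.50, 0.51, 0.48, 0.50]

fold, sd = relative_promoter_activity(tf_gfp, tf_od, vector_gfp, vector_od)
print(f"relative Pcas activity: {fold:.2f} +/- {sd:.2f} AU")
```

Normalizing each replicate to the mean of the reference keeps the replicate-to-replicate spread of the sample visible in the reported SD.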
Binding of LrhA on Pcas DNA is essential for its activation of Pcas
LrhA belongs to the LTTR family, one of the largest families of transcription factors in prokaryotes, whose members generally share a conserved monomeric structure [59, 60]. To elucidate how LrhA activates Pcas, the AlphaFold3 platform [40] was used to predict the structure of LrhA-Pcas DNA. The model with the highest confidence revealed that only the tetrameric form of LrhA, rather than monomeric or dimeric forms, adopts a compact globular conformation in complex with Pcas DNA (Fig. 2A). To validate these structural predictions, we performed MD simulations for the LrhA-Pcas complex, including 100 ns for equilibration and 500 ns for three parallel runs. RMSD analyses revealed that each trajectory stabilized after ~300 ns and remained stable thereafter with minor fluctuations (Supplementary Fig. S2A–D), supporting the stability of the LrhA tetramer–DNA complex. Gel filtration analysis revealed that LrhA alone eluted at 73.88 ml, corresponding to a dimer (calculated monomer mass: 35.8 kDa). In the presence of DNA, a fraction of LrhA formed larger complexes with LBS I (58 bp) and LBS II (59 bp), eluting at 64.72 ml and 63.90 ml, respectively. The estimated molecular weights of these complexes (∼168 and ∼181 kDa) are consistent with a tetramer of LrhA in complex with a single DNA molecule (Supplementary Fig. S2E–G), validating the structural predictions.
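The stoichiometry argument can be checked with simple arithmetic. The sketch below compares the expected mass of a dimer or tetramer bound to one DNA duplex with the estimated masses quoted above. It assumes an average of about 0.65 kDa per base pair of duplex DNA, a common rule of thumb that is not taken from this study; the function name is likewise illustrative.

```python
# Rough mass check for a protein-DNA complex observed by gel filtration.
# Assumption (not from the text): ~0.65 kDa per base pair of duplex DNA.
KDA_PER_BP = 0.65
MONOMER_KDA = 35.8  # calculated LrhA monomer mass quoted in the text


def complex_mass_kda(monomer_kda, n_subunits, dna_bp, n_dna=1):
    """Expected mass of n_subunits protein chains plus n_dna DNA duplexes."""
    return monomer_kda * n_subunits + KDA_PER_BP * dna_bp * n_dna


# (site, probe length in bp, mass estimated from the elution volume in kDa)
for site, bp, estimated in (("LBS I", 58, 168.0), ("LBS II", 59, 181.0)):
    for n in (2, 4):
        expected = complex_mass_kda(MONOMER_KDA, n, bp)
        print(f"{site}: {n}-mer + 1 DNA = {expected:.1f} kDa "
              f"(estimated from elution: {estimated:.0f} kDa)")
```

Under this assumption, a tetramer with one duplex comes to roughly 181 kDa for either probe, whereas a dimer with one duplex would be close to 110 kDa, so the estimated masses are far closer to the tetramer.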
*LrhA binds to LBS II and LBS I on Pcas DNA. (A) Predicted model of the LrhA tetramer–DNA complex based on AlphaFold 3. Protein components and DNA, including nontemplate- (nt-) strand and template- (t-) strand, are depicted using distinct colors, with proteins represented as circles and DNA as rectangles. (B–E) Amino acid interaction maps showing interacting amino acid pairs between LrhA–A and LrhA–D (B, C) and between LrhA–A and LrhA–B (D, E). Redundant and symmetric interactions are excluded. (F) Schematic illustration of the Pcas region for truncation analysis. LBS I and LBS II indicate two DNA-binding sites of LeuO located upstream and downstream of the TSS on Pcas. (G) Electrophoretic mobility shift assays (EMSA) of full-length or truncated Pcas DNA fragments with LrhA. The Pcas probes including full-length Pcas (Pcas FL), Pcas with LBS I deleted (Pcas del LBS I), Pcas with LBS II deleted (Pcas del LBS II), and Pcas with both LBS I and LBS II deleted (Pcas short probe) were generated. Pcas probes (1 nM) were separately incubated with serial concentrations of purified His–LrhA before EMSA with 5% native polyacrylamide gels (solid arrow for free Pcas probes, and braces for LrhA + probe complex). A 175 bp probe of lacZα fragment was used as the control (Ctrl probe, dashed arrow). (H) Schematic illustration of DNA binding sites required for activating Pcas by LrhA and LeuO. Transcriptional activities of full-length Pcas, Pcas del LBS I, Pcas del LBS II, and Pcas del LBS I/II were compared by using the Pcas-gfp reporter. (I) Effects of disruption of LBS I and LBS II on activation of Pcas by LrhA. The expression of LrhA or LeuO was induced by adding 30 mM arabinose. The Pcas activity was evaluated by GFP/OD600, with the relative expression level of BW25113 Δhns ΔleuO carrying the empty vector set as 1 arbitrary unit. Bars represent means and error bars represent ± SD. Individual biological replicates are shown (n = 4). 
Statistical significance was determined using one-way ANOVA with Dunnett’s multiple comparisons test. ****P < .0001; **P < .01. (J) Competitive EMSA of Pcas DNA with LeuO and LrhA, either alone, with pre-bound LeuO followed by the addition of LrhA, or with pre-bound LrhA followed by the addition of LeuO. (K) Effects of LrhA mutants on activation of Pcas in E. coli BW25113 Δhns ΔleuO (EC73). The expression of LrhA was induced by adding 30 mM arabinose. Bars represent means and error bars represent ± SD. Statistical significance was determined using one-way ANOVA with Dunnett’s multiple comparisons test. ****P < .0001; **P < .01.
Consistent with other LTTR family members, the LrhA tetramer exhibits a canonical dimer-of-dimers configuration formed via two interaction interfaces: the effector-binding domain (EBD)-to-EBD interface (Fig. 2B and C) and the DBD-to-DBD interface (Fig. 2D and E). The residue–residue contact map calculated from simulation trajectories is shown in Supplementary Fig. S3. Mutations at key interfacial residues of LrhA (R79, Y217, D14, R17, F84, and N85) severely impaired LrhA-dependent Pcas activation (Fig. 2K), validating the structural predictions. Although LrhA and LeuO share a similar DBD (Supplementary Fig. S19A), their overall structures differ significantly. The two DBD dimers in the LrhA tetramer face opposite sides in a parallel arrangement and bind to two sites on Pcas, which partially overlap with the two LeuO binding sites (LBS I and LBS II) within Pcas [29, 30]. The LrhA chains A and B interact with LBS II, while chains C and D interact with LBS I (Fig. 2A and Supplementary Fig. S1A). Notably, this binding mode differs significantly from that of LeuO. While the two DBD dimers of LrhA are oriented oppositely, those of LeuO face the same direction in the tetrameric complex (Supplementary Fig. S4A–C). This structural divergence originates from distinct angles between the linker helices and EBDs in the two proteins. Consequently, LeuO monomers adopt both compact and extended conformations, whereas all LrhA monomers exhibit a uniform conformation (Supplementary Fig. S4D). These structural differences result in distinct DNA-bending geometries. Both LrhA and LeuO induce modest bending of LBS I and LBS II, but their bending patterns differ: LrhA produces similar bending angles on LBS I (140.2°) and LBS II (145.9°), whereas LeuO displays a greater variation between LBS II (154.2°) and LBS I (131.7°) (Supplementary Fig. S19B).
The binding of LrhA to LBS I and LBS II within Pcas was examined by EMSA. DNA probes lacking LBS I and/or LBS II were generated (named “Pcas del LBS I,” “Pcas del LBS II,” and “Pcas short probe”) (Fig. 2F). As shown in Fig. 2G, 700 nM of His-LrhA was sufficient to retard the mobility of the full-length Pcas probe (Pcas FL). However, virtually no retardation of the Pcas probe lacking LBS I was detected, even in the presence of 1400 nM LrhA, indicating that LBS I is essential for LrhA binding. In contrast, deleting LBS II only partially impaired LrhA binding to Pcas (Fig. 2G). These observations indicate that LrhA binds to both LBS I and LBS II.
To determine whether LBS I and LBS II are essential for activation of Pcas by LrhA and LeuO in vivo, we generated Pcas variants with disrupted LBS I and/or LBS II (Fig. 2H) and compared their in vivo activities with that of Pcas FL in response to expression of LrhA and LeuO. Overexpressing LrhA and LeuO increased the transcriptional activity of Pcas FL by more than three- and eight-fold, respectively (Fig. 2I). In contrast, in the BW25113 Δhns ΔleuO mutant, the activity of Pcas del LBS I fell to the background level when LrhA was expressed and remained 3.88-fold above the background level when LeuO was expressed (Fig. 2I). This observation indicates that LBS I is indispensable for activation of Pcas by LrhA, and essential for full stimulation of Pcas by LeuO. Interestingly, the activity of Pcas del LBS II was also markedly reduced in the BW25113 Δhns ΔleuO mutant expressing either LrhA or LeuO (Fig. 2I), revealing that LBS II is required for activation of Pcas by both LrhA and LeuO. As expected, when both LBS I and LBS II were deleted, the expression of LrhA or LeuO no longer stimulated the expression of the impaired Pcas (Fig. 2I). Taken together, these data show that the interactions of LrhA with both LBS I and LBS II are essential for activation of Pcas by LrhA in vivo.
To determine whether LrhA and LeuO influence each other when their binding sites overlap, a competitive EMSA was performed. Pre-bound LeuO on the Pcas probe promoted the subsequent cooperative binding of LrhA. Conversely, pre-bound LrhA also facilitated LeuO binding (Fig. 2J).
The HTH of LrhA–DBD is predicted to interact with the major groove of Pcas DNA, making direct contact with nucleotides. To characterize the binding interface, a contact map was generated to identify stable LrhA-DNA interactions (Supplementary Fig. S5). LrhA residue–nucleotide pairs with interaction frequencies above 90% were selected for examining the hydrogen bonding network by GROMACS gmx hbond analysis [44] (Supplementary Fig. S6). Hydrogen bond counts between 0.8 and 1.6 (corresponding to one hydrogen bond) and between 1.6 and 2.0 (corresponding to two stable hydrogen bonds) are shown in Fig. 3A and B. Residues Gln39 (39Q) and Ser40 (40S) of LrhA–DBD form hydrogen bonds with DNA bases within LBS II and LBS I of Pcas. These residues are highly conserved, underscoring their essential role in specific DNA recognition and transcriptional regulation (Supplementary Fig. S7). A nearly identical palindromic sequence within LBS II and LBS I suggests a conserved binding pattern. Similar sequences are present in the PleuO, PnrdA, PenvR, and PacrE promoters, which are targets of LrhA identified via SELEX [61]. These sequences were analyzed using the GLAM2 Suite [62] (Table S3), generating a putative consensus DNA-binding motif for LrhA (Fig. 3C). To experimentally validate the LrhA-DNA interactions and the functional relevance of this motif, mutations were introduced at LBS I (positions +40, +42 or +54, +55) or LBS II (positions −116, −114 or −102, −101) within Pcas (Fig. 3D). Mutation of these positions within LBS I significantly reduced the LrhA-dependent activation of Pcas in vivo (Fig. 3E), and disrupted LrhA binding to Pcas in EMSA (Fig. 3F). In contrast, analogous mutations in LBS II had no detectable effect, indicating that LrhA binding to LBS I is essential for transcriptional activation of Pcas. 
Calculation of binding free energies by gmx_MMPBSA [45] revealed that the mutation at +40, +42 weakened the interaction energy between the LrhA-CD dimer and the LBS I DNA from −97.4 kcal/mol [wild-type (WT)] to −68.0 kcal/mol (Supplementary Fig. S8A). These mutations also altered the binding energy contributions of specific residues of LrhA–C (Supplementary Fig. S8B), potentially due to the loss of hydrogen bonding between LrhA–C and the mutated DNA bases (Supplementary Fig. S8C). LrhA mutants carrying substitutions at key residues implicated in Pcas binding (F28, R37, T38, Q39, S40, Q44, R48, R63, and K65) were constructed. Mutations at F28, R37, Q39, and S40 almost abolished activity of Pcas in vivo (Fig. 3G), and severely impaired DNA binding in vitro (Fig. 3H), while mutations at T38, Q44, and R63 impaired the ability of LrhA to activate Pcas. The results are consistent with the predictions from AlphaFold3 and MD simulation.
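The hydrogen-bond thresholds used above amount to a simple binning of trajectory-averaged bond counts. The sketch below illustrates that rule; the function name, the handling of the boundary value 1.6, and the example counts are assumptions for illustration and are not taken from the analysis pipeline used in this study.

```python
def classify_hbond(mean_count):
    """Map a trajectory-averaged hydrogen-bond count to a bond class.

    Thresholds follow the text: 0.8-1.6 corresponds to one hydrogen bond,
    1.6-2.0 to two stable hydrogen bonds. Assigning exactly 1.6 to the
    two-bond class is an assumption; the text does not specify the boundary.
    """
    if 1.6 <= mean_count <= 2.0:
        return "two"
    if 0.8 <= mean_count < 1.6:
        return "one"
    return "unclassified"


# Hypothetical averaged counts for placeholder residue-nucleotide pairs
# (illustration only, not data from the simulations).
example_pairs = {
    ("residue_1", "base_1"): 1.85,
    ("residue_2", "base_2"): 1.10,
    ("residue_3", "base_3"): 0.40,
}

for pair, count in example_pairs.items():
    print(pair, count, classify_hbond(count))
```

Pairs falling below 0.8 are left unclassified here, which mirrors the idea that only interactions meeting the stated thresholds are displayed.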
*Identification of LrhA binding sites on the promoter of the cas operon. Hydrogen bond formations between residues of LrhA and the LBS II DNA strands (A) or the LBS I strands (B). Dashed lines indicate hydrogen bonds formed between residues and nucleotides. Interactions are shown with directionality, indicating donor-to-acceptor orientation. Only interactions meeting defined frequency and stability thresholds are shown (see the “Materials and methods” section). Nucleotide positions are numbered relative to the TSS. (C) Putative LrhA DNA-binding motif. Arrows indicate the palindromic sequence. Inverted triangles indicate the positions for mutation. (D) Schematic representation of mutations in the LBS II and LBS I sites (highlighted in red). (E) Effects of mutations of LBS I and LBS II on activation of Pcas by LrhA in E. coli BW25113 Δhns ΔleuO (EC73). The expression of LrhA was induced by adding 30 mM arabinose. Bars represent means and error bars represent ± SD. Statistical significance was determined using one-way ANOVA with Dunnett’s multiple comparisons test. ****P < .0001; ***P < .001. (F) EMSA of WT or mutated Pcas DNA fragments with LrhA. (G) Effects of LrhA mutants on activation of Pcas in E. coli BW25113 Δhns ΔleuO (EC73). The expression of LrhA was induced by adding 30 mM arabinose. Bars represent means and error bars represent ± SD. Statistical significance was determined using one-way ANOVA with Dunnett’s multiple comparisons test. ****P < .0001; ***P < .001; **P < .01; *P < .05. (H) EMSA of Pcas DNA with WT or mutated LrhA.
LrhA stabilizes RNAP-Pcas interactions and modulates Pcas DNA geometry
To investigate how LrhA influences transcription initiation at Pcas, we generated a structural model of the RNAP-Pcas DNA-LrhA complex using AlphaFold 3 [40]. The predicted arrangement closely matched the RPo-state σ^70^-holoenzyme structure (PDB: 7MKD), with an overall backbone RMSD of 1.176 Å between the two models (Fig. 4A). This high structural congruence indicates that the predicted complex adopts an RNAP promoter open complex (RPo)-like configuration and is structurally compatible with productive transcription initiation. MD simulations were performed to predict the stability of RNAP-Pcas DNA models in the presence or absence of LrhA. Binding free energies between RpoD and promoter DNA, computed using gmx_MMPBSA [45], indicated consistently stronger binding in the presence of LrhA than in the LrhA-free complexes (Fig. 4B). Consistently, BACTH assays revealed a positive interaction between LrhA and RpoD (σ^70^) (Fig. 4C), suggesting that LrhA stabilizes RNAP at Pcas through interactions with multiple RNAP subunits, thereby enhancing transcription initiation. This stabilizing effect is consistent with the universal, homeostatic mechanism described for transcription factors, in which both activators and repressors modulate promoter output primarily by stabilizing RNAP binding at promoters [63].
LrhA stabilizes RNAP-Pcas interactions and modulates Pcas DNA geometry. (A) Predicted structural model of the σ70-holoenzyme bound to Pcas and an LrhA tetramer. LrhA is shown as a gold-colored homotetramer positioned upstream of the TSS. The duplex DNA is rendered in royal blue. The RNAP core subunits (RpoB, RpoC, RpoA, and RpoZ) are displayed in light gray, and RpoD (σ70) is highlighted in tomato. (B) Comparison of ΔTOTAL binding energies between RpoD and promoter DNA in the presence versus absence of LrhA. Binding free energies were computed using gmx_MMPBSA [45] over the production intervals specified in Methods. Bars represent mean values across the analyzed trajectory windows. (C) BACTH assay for the interactions between LrhA and RpoD (σ70). (D) Time evolution of promoter DNA bending in RNAP-Pcas DNA complexes with and without LrhA. Bending angles were calculated for the −60 to +20 promoter segment using the three-block COM definition described in the “Materials and methods” section.
Finally, we quantified promoter bending for the −60 to +20 region using the three-block COM definition [55]. Across all replicates, the LrhA-bound complexes exhibited a higher mean bending angle (156.6°) compared with the LrhA-free models (150.1°), with the difference becoming more pronounced during the latter half of the trajectories (Fig. 4D). These results indicate that LrhA modestly reshapes promoter DNA geometry while simultaneously strengthening RpoD–DNA interactions, providing a structural basis for its activation of CRISPR-associated transcription.
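A bending angle of this kind can be computed from the centers of mass (COM) of three consecutive DNA blocks. The sketch below assumes one common form of such a definition: the angle at the central block's COM between the vectors pointing to the two flanking COMs, so that perfectly straight DNA gives 180°. The exact block boundaries and weighting used in this study are described in the Methods and are not reproduced here; the function names and toy coordinates are illustrative.

```python
import math


def center_of_mass(coords, masses=None):
    """Mass-weighted mean position of a set of 3D points (equal masses by default)."""
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    return tuple(
        sum(m * c[i] for m, c in zip(masses, coords)) / total for i in range(3)
    )


def bending_angle_deg(block_up, block_mid, block_down):
    """Angle (degrees) at the middle block's COM between the flanking COMs.

    Assumed definition: 180 degrees means straight DNA; smaller values
    mean a sharper bend.
    """
    a = center_of_mass(block_up)
    b = center_of_mass(block_mid)
    c = center_of_mass(block_down)
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(x * x for x in v2))
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cos_theta))


# Toy coordinates (arbitrary units): three blocks along a gently bent path.
upstream = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
middle = [(10.0, 2.0, 0.0), (12.0, 2.0, 0.0)]
downstream = [(20.0, 0.0, 0.0), (22.0, 0.0, 0.0)]
print(f"bending angle: {bending_angle_deg(upstream, middle, downstream):.1f} deg")
```

Applied per frame, a function like this yields the time series from which mean angles over a trajectory can be averaged.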
LrhA activates CRISPR interference against M13 phage infection
Having established LrhA as a regulator of Pcas, we sought to test whether it plays a role in regulating CRISPR interference in WT E. coli. The CRISPR-Cas system is reported to exhibit little immunity to some well-characterized phages (such as phage λ) in E. coli due to the silencing of endogenous chromosomal cas gene expression [64]. Overexpression of the cas genes and crRNA can lead to de-repression of the CRISPR-Cas system in some E. coli K12 strains [64, 65]. Before assessing the physiological effects of LrhA on CRISPR-Cas in a WT background, we evaluated the expression of lrhA and cas genes in eight E. coli strains. In one strain, ER2738, expression levels of lrhA and cse1 were significantly higher than in the other strains (Supplementary Fig. S9A). ER2738 contains an F-factor encoding pili that are required for M13 phage adsorption [66]. Thus, we assessed the role of LrhA in regulating CRISPR immunity against M13 in this strain. A spacer (g8 spacer) matching gene 8 of M13 phage was integrated into the CRISPR I array of the WT ER2738 strain (WT-g8), which was expected to confer resistance to M13 phage infection (Fig. 5A). In the M13-targeting strain (with g8 spacer), inactivation of lrhA in ER2738-g8 (ΔlrhA-g8) reduced the phage titer by ∼10^6^-fold, while complementing lrhA in the ΔlrhA mutant (lrhA^c^-g8) fully restored its resistance to M13 phage (Fig. 5B and Supplementary Fig. S10A), demonstrating that lrhA expressed from the genome enhances immunity against M13 phage. Inactivation of cas3 abolished the CRISPR immunity in WT-g8 (Supplementary Fig. S10B), confirming that immunity against M13 phage is dependent on the CRISPR-Cas system. In the non-targeting strain, inactivation of lrhA did not affect M13 phage infection (Fig. 5C). ER2738-g8 (lrhA^+^) exhibited an ~18-fold higher level of CRISPR immunity against the CRISPR-targeted plasmid compared with the ER2738 ΔlrhA-g8 (lrhA^−^) strain, but did not show enhanced spacer acquisition (Supplementary Fig. 
S10C–E). The effect of LrhA on transcription of the cas operon in ER2738 was evaluated by quantifying the cse1 transcript level through qPCR. The expression of cse1 was reduced two-fold by inactivation of lrhA, while complementing lrhA restored the transcription of the cas operon in the ΔlrhA mutant (Fig. 5D). The results demonstrate that LrhA activates transcription of the cas operon. We further explored regulation of lrhA during phage infection and plasmid transformation, and found that lrhA transcription was induced by M13 phage infection (Fig. 5E), but not by plasmid transformation (Supplementary Fig. S10F). Together, these results demonstrate that LrhA plays a crucial role in promoting CRISPR interference against phage infection in a WT E. coli strain.
*LrhA activates CRISPR-Cas immunity against M13 phage. (A) Schematic illustration of the engineered CRISPR cassette carrying a g8 spacer in the genomic CRISPR I locus. In the M13-targeting cells, CRISPR I contains an additional g8 spacer, conferring cells with resistance to M13 phage infection. During infection, phage DNA enters the cell as a circular (+) single-stranded DNA, followed by the formation of double-stranded (ds) phage genome DNA (template), which is replicated in a rolling circle form to generate progeny (+) strands that are packaged into mature phage. M13 phage DNA in the dsDNA form (i.e. template and rolling circle replication) can be recognized by the g8 crRNA-guided Cascade complex. (B) Quantification of the phage titer (PFU/ml) at 2–12 h post-infection with M13 phage in ER2738 WT, ΔlrhA, or lrhAc cells with g8 spacer (M13-targeting strain: WT-g8, EC89; ΔlrhA-g8, EC90; and lrhAc-g8, EC91). (C) Quantification of the phage titer (PFU/ml) at 2–12 h post-infection with M13 phage in cells without g8 spacer (non-targeting strain: WT, ER2738; ΔlrhA, EC87; and lrhAc, EC88). Individual biological replicates are shown (n = 3). Statistical significance between WT-g8 and ΔlrhA-g8 at 2–12 h was determined using multiple t-tests. *P < .05; **P < .01; ***P < .001. (D) qPCR analysis of cse1 in WT, ΔlrhA, and lrhAc strains. (E) qPCR analysis of lrhA in WT cells with or without M13 phage infection.
LrhA promotes clearance of the CRISPR-targeted plasmid
The role of LrhA in regulating CRISPR immunity was evaluated in E. coli BW25113. Deletion of lrhA had no effect on CRISPR immunity against pTc during conjugation (Supplementary Fig. S11A) or against M13 phage in WT BW25113 strains carrying the F factor (Supplementary Fig. S11B). Considering that H-NS represses the CRISPR-Cas system in E. coli K12 strains [31], it is possible that the effect of LrhA was masked by H-NS in BW25113 (Supplementary Fig. S11C). Thus, we examined the role of LrhA in cas gene expression and CRISPR-Cas immunity against plasmids (Fig. 6A) in a Δhns mutant. Deleting lrhA decreased transcription of cse1 by 11% (relative to Δhns) (Fig. 6B). With leuO additionally deleted, LrhA had a more pronounced effect on the activity of Pcas: deleting lrhA decreased transcription of cse1 by 26% (relative to Δhns ΔleuO) (Fig. 6B). CRISPR immunity against the transferred plasmid was assessed by evaluating the clearance rate of the CRISPR-targeted plasmid in lrhA^+^ and lrhA^−^ strains. The relative CRISPR immunity in the lrhA^+^ strain was ∼3.5-fold that in the lrhA^−^ strain with deletion of hns (Fig. 6C) and ∼9.8-fold that in the lrhA^−^ strain with further deletion of leuO (Fig. 6C), demonstrating that LrhA activates the CRISPR-Cas system for clearing the CRISPR-targeted plasmid independently of H-NS and LeuO. Given that deleting leuO led LrhA to exhibit a stronger effect on promoting the clearance of the CRISPR-targeted plasmid, promotion of CRISPR immunity by LrhA could be partially masked by LeuO.
*LrhA activates both CRISPR-Cas interference and adaptation. (A) Schematic illustration for evaluating the CRISPR-Cas interference activity by plasmid loss assay. (B) Effects of genomic lrhA on the expression of cas genes (indicated by cse1) assessed by qPCR assay. cse1 mRNA levels were compared in BW25113 WT, ΔlrhA (EC75), Δhns (EC71), Δhns ΔlrhA (EC76), Δhns ΔleuO (EC73), and Δhns ΔleuO ΔlrhA (EC77) strains grown in LB medium at 37°C, with the relative expression level of BW25113 WT set as 1 arbitrary unit. Statistical significance was assessed using a one-way ANOVA. *P < .05. Individual biological replicates are shown (n = 4). (C, D) Relative CRISPR immunity assessed by plasmid loss assay in E. coli strains BW25113 Δhns (EC71), Δhns ΔlrhA (EC76), Δhns ΔleuO (EC73), and Δhns ΔleuO ΔlrhA (EC77) grown in antibiotic-free LB medium for 36 h. Statistical significance was assessed using a two-tailed Student’s t-test. Bars represent means and error bars represent ± SD. Individual biological replicates are shown (n = 5 for C and n = 3 for D). *P < .05; **P < .01. (E) Schematics of the spacer acquisition assay in panels (F) and (G). During the adaptation process, a new spacer (rectangle) can be inserted into the leader end of the CRISPR array. PCR was performed to amplify the leader-spacer region using primers matching the leader and the second spacer, which are indicated as flagged arrows. (F) Spacer-acquisition assay by PCR amplification of BW25113 Δhns ΔleuO (+) and BW25113 Δhns ΔleuO ΔlrhA (−) cells carrying pT or pNT plasmids, which were grown at 37°C or 30°C for 36 h. PCR products were electrophoresed on a 2% agarose gel and imaged. Parental (P) and expanded bands (E) are indicated on the right, while DNA marker (M) positions are displayed on the left. Gels are representative of two experiments yielding similar results. 
(G) Spacer-acquisition assay of Δhns ΔleuO (+), Δhns ΔleuO ΔlrhA (-), and genomic lrhA complementation (c) strains, with their cas1–cas2 deletion (Δcas1–2, EC81, EC82, EC83, indicated by “-”) or its complementation (cas1–2 c, EC84, EC85, EC86, indicated by “c”). (H) Schematic illustration of the relationship between CRISPR interference and adaptation regulated by LrhA. In the presence of cas1–2, LrhA promotes the positive feedback circuit between CRISPR interference and adaptation (upper panel). In the absence of cas1–2, LrhA shows a weak effect on interference (lower panel).
Promoting primed adaptation by LrhA accelerates clearance of the CRISPR-targeted plasmid
We observed an apparent effect of LrhA on CRISPR immunity against a transferred plasmid during long-term growth, but not against a plasmid being transferred during conjugation. Given that weak interference often allows DNA to persist longer and promotes acquisition of new spacers that allow bacteria to mount more efficient CRISPR interference [12, 67, 68], we hypothesized that intermediate activation of the cas operon by moderate expression of LrhA might promote clearance of pT by better coordinating interference and primed adaptation. To test this hypothesis, we monitored CRISPR array expansion in response to pT and pNT in Δhns ΔleuO (+) and Δhns ΔleuO ΔlrhA (-) strains (Fig. 6E). Compared with the Δhns ΔleuO strain, fewer expansion events were observed in the Δhns ΔleuO ΔlrhA mutant carrying pT after 36 h of incubation at 37°C (body temperature) or 30°C (environmental temperature), showing that LrhA is required for efficient adaptation (Fig. 6F and Supplementary Fig. S12A and B). Complementing lrhA on the genome of the Δhns ΔleuO ΔlrhA mutant (c) restored adaptation (Fig. 6G and Supplementary Fig. S12B). In contrast, CRISPR array expansion was not observed in these strains carrying pNT (Fig. 6F). Sequencing analysis of the expansion bands from Δhns ΔleuO strains carrying pT revealed that all newly formed spacers (20/20) were derived from the CRISPR-targeted plasmid (Supplementary Fig. S12C), with a strong preference for an AAG PAM (16/20) during adaptation, and a pronounced bias in the DNA strand (15/20) matching the orientation of the CRISPR-targeting units (CR1) (Supplementary Table S4), in agreement with typical traits of primed CRISPR adaptation [12, 67]. Compared with other regions on the CRISPR-targeted plasmid, more protospacers were mapped to coding genes (i.e. bla and dsRED) and the replication origin (Supplementary Fig. S12C).
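The spacer statistics above are simple tallies over sequenced spacers. As a minimal sketch of how such a tally could be organized, the code below counts the fraction of spacers with a given PAM and strand orientation. The record format, field names, and the placeholder records are hypothetical; the records are constructed only so that the marginal counts match those reported (16/20 AAG PAM, 15/20 strand bias), and their pairing of PAM with strand is invented.

```python
from collections import Counter


def summarize_spacers(spacers, pam="AAG", strand="same"):
    """Return the number of spacers and the fractions matching a PAM and a strand.

    Each spacer is a dict with hypothetical keys "pam" and "strand", where
    "strand" is given relative to the priming protospacer.
    """
    n = len(spacers)
    if n == 0:
        raise ValueError("no spacers to summarize")
    pam_counts = Counter(s["pam"] for s in spacers)
    strand_counts = Counter(s["strand"] for s in spacers)
    return {
        "n": n,
        "pam_fraction": pam_counts[pam] / n,
        "strand_fraction": strand_counts[strand] / n,
    }


# Placeholder records: only the marginal totals follow the text.
records = (
    [{"pam": "AAG", "strand": "same"}] * 13
    + [{"pam": "AAG", "strand": "opposite"}] * 3
    + [{"pam": "other", "strand": "same"}] * 2
    + [{"pam": "other", "strand": "opposite"}] * 2
)

summary = summarize_spacers(records)
print(f"n = {summary['n']}, AAG PAM = {summary['pam_fraction']:.0%}, "
      f"strand bias = {summary['strand_fraction']:.0%}")
```

With real data, each record would come from mapping a sequenced spacer back to the plasmid and reading off the adjacent PAM and the strand.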
To confirm that LrhA promoted clearance of pT through reinforcing the interplay between CRISPR-Cas interference and primed adaptation, we deleted both cas1 and cas2 (Δcas1–2) and examined the CRISPR interference in these strains through the plasmid loss assay. Deletion of cas1–cas2 abolished the CRISPR array expansion (Fig. 6G), and reduced the effect of LrhA on CRISPR immunity against plasmid (Fig. 6G), revealing that LrhA activates CRISPR immunity through promoting the positive feedback of interference and adaptation processes (Fig. 6H). Loss of CRISPR expansion from Δcas1–2 cells was restored by genomic complementation of cas1–cas2 (Fig. 6G), consistent with the critical role of cas1–cas2 in adaptation. Activation of CRISPR interference by LrhA was regained by complementation of cas1–cas2 (Supplementary Fig. S12D), confirming that Cas1–Cas2 was essential to LrhA-regulated adaptation-mediated interference.
Discussion
The type I-E CRISPR-Cas system is widely distributed among E. coli strains based on analysis of collections of natural isolates [69] (Supplementary Fig. S16). It has also been reported to be functional in other Enterobacteriaceae (for example, Citrobacter and Klebsiella) [70]. In this study, we have addressed the important question of how the type I-E CRISPR-Cas system is activated in response to a transcriptional regulator in E. coli strains. By combining DNA pull-down and mass spectrometry, we identified a series of LTTRs as regulators of Pcas, most notably LrhA, which enhances immunity against phage infection and facilitates a reciprocal interplay between interference and adaptation, thereby providing immunity against horizontally transferred plasmids in E. coli. According to sequences of E. coli natural isolates, hns, leuO, and lrhA were retrieved with high-confidence matches (coverage 99%–100%, 46 out of 48 strains) (Supplementary Fig. S16). LrhA is a conserved pleiotropic regulator in Enterobacteria (homologues named RovM in Yersinia, PigU in Serratia, HexA in Photorhabdus, PecT in Erwinia) [71, 72]. This conserved co-occurrence suggests that the core regulatory framework is broadly conserved across Enterobacteriaceae. PigU was reported to regulate both type III and type I-F CRISPR-Cas systems in Serratia [72]. Our work, along with that of others, suggests that LrhA and its homologs may play a widespread regulatory role across Enterobacteriaceae. Bioinformatic analyses have revealed that some E. coli strains lack the type I-E CRISPR-Cas system [69, 73, 74]. In strains that do possess a CRISPR-Cas system, H-NS-mediated silencing often inhibits its expression, thereby limiting adaptive immunity [29, 31, 75]. However, under certain conditions, H-NS-mediated silencing can be relieved. For example, phage-encoded proteins (MotB of bacteriophage T4 and gp4 of Pseudomonas lytic phage LUZ24) can abrogate the DNA-binding and repressive function of H-NS [76] and of an H-NS family protein [77]. 
Additionally, in some strains with low levels of H-NS, H-NS-mediated silencing could be weak, thereby allowing LrhA to enhance CRISPR immunity. Our research indicates that levels of lrhA mRNA differ in various WT E. coli strains (Supplementary Fig. S9A). This could be relevant to H-NS-mediated silencing. In BW25113, both leuO and lrhA are strongly suppressed by H-NS, whereas in ER2738, lrhA is only weakly repressed by H-NS (Supplementary Fig. S17E). As a defence system, the CRISPR-Cas system can be triggered by phages through host sensors and signal transduction pathways [22, 78, 79]. Our findings indicate that lrhA expression is triggered by phage infection, rather than plasmid transformation, revealing that phage-specific induction of lrhA establishes a regulatory link between the host’s response and phage infection. Nevertheless, phage determinants that induce expression of lrhA remain to be explored. Given that levels of lrhA mRNA partially correlate with the expression of the cas operon (Supplementary Fig. S9B), we propose that LrhA plays critical roles in tuning the CRISPR immunity within E. coli populations, thereby balancing the fitness costs while maintaining effective defense against foreign genetic elements. Collectively, our study not only provides new insights into the dynamic regulation of the CRISPR-Cas system against exogenous DNA under different scenarios, adding a new layer of regulatory complexity in bacterial defense, but also introduces a functional approach to screen regulators that directly control bacterial transcription.
LTTRs constitute one of the largest transcription factor families, regulating diverse aspects of bacterial life. Our screening identified a new CRISPR-Cas regulator, LrhA, which shares similarities in DNA binding with LeuO, a key activator of the type I-E CRISPR-Cas system in E. coli and S. Typhi [29, 30, 80]. The simulated structure of the RNAP-Pcas DNA–LrhA complex supports a “DNA looping” mechanism for LrhA-mediated activation of σ^70^-dependent transcription: by binding to two sites on Pcas, LrhA facilitates the recruitment and assembly of the RNAP-σ^70^ holoenzyme into a transcription-competent open complex. A recent study demonstrated that NtcA and NtcB, two different LTTRs, cooperatively activate σ^70^-dependent transcription in bacteria through a “DNA looping” mechanism [81]. Our work, along with others, suggests that “DNA looping” could be a general mechanism employed by LTTRs to enhance promoter activity, offering new insights into how LTTRs regulate gene transcription in a broader context.
Early reports suggested that primed adaptation was triggered by suboptimal Cascade-protospacer interactions that could lead to weak interference [12, 82, 83], whereas others indicated that optimal Cascade-protospacer interactions also led to both interference and primed adaptation, which was fueled by interference [14, 68, 84, 85]. Evidence remains lacking for intracellular regulation of CRISPR immunity through coordination of interference and adaptation. In this study, we found that LrhA moderately stimulated transcription of cas genes and promoted the clearance of the CRISPR-targeted plasmid. Adaptation occurred more frequently in the lrhA^+^ E. coli strains harboring the CRISPR-targeted plasmid (Fig. 6F and Supplementary Fig. S12B), revealing that LrhA can promote interference-driven spacer acquisition. Given that inactivation of cas1 and cas2 remarkably reduced LrhA-stimulated CRISPR immunity (Fig. 6D), we propose that moderate activation of cas gene transcription by LrhA enables a positive feedback circuit between CRISPR interference and adaptation: cleavage of the CRISPR-targeted plasmid produces more short DNA fragments that accelerate primed adaptation, which in turn enhances clearance of the CRISPR-targeted plasmid. In addition to LrhA, our work shows that LeuO can also promote adaptation (data not shown), but its effect on clearance of the CRISPR-targeted plasmid is not dependent on cas1–2 (Fig. 6D), indicating that strong activation of cas gene transcription by LeuO was unlikely to trigger the adaptation-interference circuit. Analysis of the acquired spacers revealed an unequal distribution (Supplementary Fig. S12C), suggesting that certain sequences could be preferentially integrated as new spacers.
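The proposed feedback circuit can be illustrated with a deliberately simple toy model. The sketch below is not the analysis used in this study: the equations, the function name `simulate_clearance`, and all parameter values are hypothetical choices made only for illustration. Here `k_int` stands in for the level of cas gene expression, `k_adapt` for the efficiency of primed adaptation, and `r` for plasmid replication. Setting `k_adapt` to zero loosely mimics inactivation of cas1 and cas2.

```python
def simulate_clearance(k_int, k_adapt, r=1.0, dt=0.01, t_max=200.0, threshold=0.01):
    """Euler-integrate a toy interference/adaptation model (hypothetical).

    p: plasmid level relative to its carrying capacity (starts at 1.0)
    s: effective number of targeting spacers (starts at 1.0, the priming spacer)

    Returns the time at which p first drops below `threshold`,
    or None if the plasmid is not cleared before `t_max`.
    """
    p, s, t = 1.0, 1.0, 0.0
    while t < t_max:
        cleavage = k_int * s * p        # interference: plasmid cleavage rate
        growth = r * p * (1.0 - p)      # logistic plasmid replication
        p += (growth - cleavage) * dt
        s += k_adapt * cleavage * dt    # cleavage products feed primed adaptation
        t += dt
        if p < threshold:
            return t
    return None


# Moderate interference (k_int below the replication rate r), illustrative values only.
without_adaptation = simulate_clearance(k_int=0.8, k_adapt=0.0)
with_adaptation = simulate_clearance(k_int=0.8, k_adapt=2.0)
print("clearance time without adaptation:", without_adaptation)
print("clearance time with adaptation:", with_adaptation)
```

With these illustrative parameters, interference alone only lowers the plasmid level to a steady state, whereas coupling cleavage to spacer acquisition lets targeting capacity build up until the plasmid is lost. The model makes no statement about actual rates in vivo.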
The transcriptional regulators of the type I-E CRISPR-Cas system have not been screened thoroughly. In a previous study aimed at identifying host factors potentially involved in adaptation, Yoganand et al. employed CRISPR/dCas9-mediated immunoprecipitation (IP) to identify host factors that bound the CRISPR leader region [86]. Recently, a CRISPRi-based screen identified SspA, a host factor required for CRISPR adaptation in the type I-E system of E. coli [87]. Using a DNA pull-down assay with a DNA probe containing Pcas as the bait, we identified a number of host factors from the cell lysate that associated with Pcas. Although the pull-down assay successfully identified known regulators such as SlyA and StpA, its sensitivity is constrained by protein abundance, a common limitation of such affinity-based approaches. For instance, LeuO was not detected, likely owing to its extremely low cellular abundance (~11 molecules per cell), which is substantially lower than that of LrhA (174–962 molecules per cell) [88]. Notably, both our in vivo and in vitro data reveal that the pleiotropic LTTR LrhA activates the cas operon by binding to the LeuO binding sites in Pcas. LTTR family proteins seem to extensively regulate diverse types of CRISPR-Cas systems in different bacteria. LrhA (named PigU in Serratia) was reported to suppress the type III and type I-F but not the type I-E CRISPR-Cas system [72], indicating that this regulator plays differential roles in different bacteria. In Salmonella enterica, transcription of the CRISPR-Cas system was regulated by two LTTRs with opposing functions (i.e. LeuO and LRP) [89]. LeuO is normally considered an antagonist of H-NS, and activates gene transcription by relieving transcriptional suppression by H-NS [90, 91]. We found that overexpressed LrhA and LeuO promoted the activity of Pcas in the Δhns ΔleuO mutants (Fig. 1E), indicating that they both regulate the cas operon independently of H-NS.
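To put these copy numbers in perspective, the short calculation below converts molecules per cell into approximate concentrations and expresses LrhA abundance relative to LeuO. The copy numbers are those quoted above. The cell volume of about 1 fL is a commonly used round figure for E. coli and is an assumption here, not a value from this study, and the function names are illustrative.

```python
# Copy numbers quoted in the text (molecules per cell).
LEUO_COPIES = 11
LRHA_COPIES_RANGE = (174, 962)

# Assumption (not from the study): an E. coli cell volume of about 1 femtolitre.
CELL_VOLUME_L = 1e-15
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_to_nanomolar(copies, cell_volume_l=CELL_VOLUME_L):
    """Convert molecules per cell into an approximate concentration in nM."""
    moles = copies / AVOGADRO
    return moles / cell_volume_l * 1e9


def fold_over_leuo(copies, reference=LEUO_COPIES):
    """Abundance of a protein relative to the quoted LeuO copy number."""
    return copies / reference


low, high = LRHA_COPIES_RANGE
print(f"LeuO: about {copies_to_nanomolar(LEUO_COPIES):.0f} nM")
print(f"LrhA: about {copies_to_nanomolar(low):.0f} to {copies_to_nanomolar(high):.0f} nM")
print(f"LrhA/LeuO: about {fold_over_leuo(low):.0f}- to {fold_over_leuo(high):.0f}-fold")
```

Under this assumption, one molecule per cell corresponds to roughly 1.7 nM, and LrhA is about 16- to 87-fold more abundant than LeuO.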
Although both LrhA and LeuO bind Pcas, the two regulators show different binding site dependencies at Pcas. Their impacts on regulation of the type I-E CRISPR-Cas system and other physiological functions are context dependent. LrhA strongly stimulated CRISPR-Cas activity against M13 phage in the WT E. coli ER2738 (Fig. 5B), demonstrating that LrhA is capable of playing a dominant role in regulating CRISPR-Cas activity in this strain upon phage infection. The modes by which LeuO and LrhA activate cas genes also differ. Although LeuO stimulated stronger transcription of cas genes, LrhA stimulated higher adaptive immunity in clearing the CRISPR-targeted plasmid by better coordinating interference and adaptation. AlphaFold3 prediction and MD simulations support the stable formation of the tetrameric LrhA-Pcas complex, indicating that LrhA activates Pcas through a noncanonical mechanism involving binding to a downstream site. While downstream activation has been well documented in eukaryotic and viral systems, such as the immunoglobulin κ promoter [92] and HIV promoter [93], where transcription factors like Sp1 mediate DNA looping by binding distal sites [94], this strategy is rare in bacteria. Nonetheless, a few bacterial regulators, GcrA [95], LadA [96], and IHF [97], have been reported to activate transcription by binding downstream of the core promoter. We speculate that these factors may activate transcription through a mechanism analogous to that of eukaryotic regulators, by promoting DNA bending or looping to facilitate the recruitment of RNAP or co-activators. Together with previous studies, our findings support the existence of an unconventional mode of transcriptional activation in bacteria and expand the functional repertoire of downstream regulatory elements.
Apart from regulating the CRISPR-Cas system, LrhA and LeuO showed remarkable differences in their modes of transcriptional activation and in the control of other physiological functions. The structural divergence between LrhA and LeuO results in distinct DNA bending patterns (Supplementary Fig. S19A and B), thereby leading to distinct regulatory outcomes at downstream target genes. For example, the flagellar flhDC operon can be inhibited by LrhA, but not by LeuO (Supplementary Fig. S13). Phylogenetic analysis revealed divergent evolution of LrhA and LeuO, which could belong to distinct subfamilies and evolve separately in bacteria (Supplementary Figs S14 and S15). Moreover, expression of LrhA and LeuO is induced through different mechanisms. The expression of LeuO can be activated by the heterodimer BglJ-RcsB [98], the cyanase regulator CynR [71], and the antibiotic response regulator YdcI [71], while the expression of LrhA is stimulated by GadE and HdeD, both components of the glutamic acid-dependent acid resistance system [99], as well as by the phosphatase RcsD of the Rcs signal transduction system. The small RNAs UhpU, ArcZ, and RprA interact with the 5′ UTR of lrhA and inhibit LrhA synthesis. We attempted to explore the upstream regulators of the LrhA-Pcas axis and found that ArcZ reduced the expression of lrhA and the cas operon in BW25113 Δhns ΔleuO, but did not affect their expression in ER2738 (Supplementary Fig. S18 B, D, F and H). This reveals that activation of Pcas by inducing expression of LrhA varies among E. coli strains.
In conclusion, we identify the LTTR LrhA as a crucial activator of the type I-E CRISPR-Cas system in E. coli. By binding to two distinct sites within the cas promoter, LrhA enhances cas gene expression and promotes CRISPR-mediated immunity against phage infection. Additionally, LrhA reinforces a positive feedback loop between interference and adaptation, thereby facilitating efficient plasmid clearance. These findings uncover a previously unrecognized mode of transcriptional regulation and deepen our understanding of how CRISPR-Cas activity is modulated in bacteria. Future studies examining how LrhA is integrated into global regulatory networks and responds to environmental signals will further illuminate the complexity and plasticity of CRISPR-Cas regulation in prokaryotic adaptive immunity.
Resource availability
Materials availability
All materials generated for this study are available upon request.
Supplementary Material
gkag204_Supplemental_File
References
- 1. Mayo-Muñoz D, Pinilla-Redondo R, Birkholz N et al. A host of armor: prokaryotic immune strategies against mobile genetic elements. Cell Rep. 2023;42:112672. doi:10.1016/j.celrep.2023.112672. PMID: 37347666.
- 2. Makarova KS, Wolf YI, Iranzo J et al. Evolutionary classification of CRISPR-Cas systems: a burst of class 2 and derived variants. Nat Rev Micro. 2020;18:67–83. doi:10.1038/s41579-019-0299-x. PMID: 31857715. PMCID: PMC8905525.
- 3. Sasnauskas G, Siksnys V. CRISPR adaptation from a structural perspective. Curr Opin Microbiol. 2020;65:17–25. doi:10.1016/j.sbi.2020.05.015. PMID: 32570107.
- 4. Haurwitz RE, Jinek M, Wiedenheft B et al. Sequence- and structure-specific RNA processing by a CRISPR endonuclease. Science. 2010;329:1355–8. doi:10.1126/science.1192272. PMID: 20829488. PMCID: PMC3133607.
- 5. Xiao Y, Luo M, Hayes RP et al. Structure basis for directional R-loop formation and substrate handover mechanisms in type I CRISPR-Cas system. Cell. 2017;170:48–60.e11. doi:10.1016/j.cell.2017.06.012. PMID: 28666122. PMCID: PMC5841471.
- 6. Baumann K. Genome editing: CRISPR-Cas becoming more human. Nat Rev Mol Cell Biol. 2017;18:591. doi:10.1038/nrm.2017.84. PMID: 28811667.
- 7. Xun G, Zhu Z, Singh N et al. Harnessing noncanonical crRNA for highly efficient genome editing. Nat Commun. 2024;15:3823. doi:10.1038/s41467-024-48012-x. PMID: 38714643. PMCID: PMC11076584.
- 8. Gencay YE, Jasinskytė D, Robert C et al. Engineered phage with antibacterial CRISPR-Cas selectively reduce E. coli burden in mice. Nat Biotechnol. 2024;42:265–74. doi:10.1038/s41587-023-01759-y. PMID: 37142704. PMCID: PMC10869271.
