A historical sequence deletion in a commonly used Bacillus subtilis chromosome integration vector generates undetected loss-of-function mutations
K. Julia Dierksheide, Gene-Wei Li

TL;DR
A 227-bp deletion in a commonly used Bacillus subtilis vector causes unintended gene mutations that may go unnoticed in experiments.
Contribution
Identifies a historical deletion in a vector that leads to undetected loss-of-function mutations in downstream genes.
Findings
A 227-bp deletion in the amyE integration vector causes unintended recombination in ~10% of colonies.
The deletion leads to a truncation of the ldh gene, which may affect fermentative metabolism undetected.
Both correct and incorrect recombinations test positive for amyE disruption, potentially confounding experiments.
Abstract
Since the 1980s, chromosome-integration vectors have been used as a core method of engineering Bacillus subtilis. One of the most frequently used vector backbones contains chromosomally derived regions that direct homologous recombination into the amyE locus. Here, we report a gap in the homology region inherited from the original amyE integration vector, leading to erroneous recombination in a subset of transformants and a loss-of-function mutation in the downstream gene. Internal to the homology arm that spans the 3′ portion of amyE and the downstream gene ldh, an unintentional 227 bp deletion generates two crossover events. The major event yields the intended genotype, but the minor event, occurring in ~10 % of colonies, results in a truncation of ldh, which encodes lactate dehydrogenase. Although both types of colonies test positive for amyE disruption by starch plating, the…
Funding
- National Science Foundation (http://dx.doi.org/10.13039/100000001)
- National Institutes of Health (http://dx.doi.org/10.13039/100000002)
Taxonomy
Topics: Bacterial Genetics and Biotechnology · Bacteriophages and microbial interactions · RNA and protein synthesis mechanisms
Main text
The model Gram-positive bacterium Bacillus subtilis is widely used for strain engineering due to its natural competence and efficient homologous recombination system [1, 2]. Synthetic DNA is commonly introduced into specific loci of the genome via homology-containing integration vectors that can be constructed and manipulated as plasmids in Escherichia coli (Fig. 1a). One of the first genomic loci developed for integration vectors is the amyE gene, which encodes α-amylase, a protein involved in starch degradation [3, 4]. Successful integration disrupts amyE, and this disruption can be easily screened for using an iodine stain that changes coloration upon binding to starch (‘starch test’) [2]. The original amyE double-crossover integration vector pBGtrp and its derivatives, such as pDR111 and pDG1661 [5, 6], have enabled studies on many aspects of microbiology, ranging from gene regulation to cell division [7–10]. They have also been central to the development of synthetic biology toolkits for B. subtilis [11–15].
Fig. 1. Double-crossover events at amyE. (a) Schematic of an amyE integration vector (top) designed to direct integration of the insert (yellow) into the genome as shown in the transformant genome (bottom). On the integration vector, the insert is flanked by two homology arms, amyE-front and amyE-back (green). (b) Schematic of the missing homology region. In the B. subtilis genome, amyE is followed by the ldh-lctP operon (top). In pBGtrp and its derivatives, the annotated amyE-back region is followed by a 153 bp fragment of ldh, while missing the intervening 227 bp sequence (bottom). (c) The two possible double-crossover events. In both cases, crossover occurs as expected at the upstream amyE-front region, but the missing genome sequence in the plasmid allows for two possible recombination events at the downstream amyE-back region. The minor event results in loss of 227 bp of genomic sequence containing the ribosome binding site and the first 215 nucleotides of ldh.
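The two outcomes described in panel (c) can be illustrated with a toy string model. This is only a sketch under stated assumptions: the sequences are random placeholders, every segment length other than 227 bp and 153 bp is arbitrary, and the helper `double_crossover` is a hypothetical name introduced here, not something from the paper.

```python
import random

random.seed(0)  # reproducible placeholder sequences

def rand_dna(n):
    """Random DNA string used as a stand-in for real sequence."""
    return "".join(random.choice("ACGT") for _ in range(n))

# Toy genome layout; only the 227 and 153 bp lengths come from the text.
front = rand_dna(300)     # amyE-front homology region (arbitrary length)
middle = rand_dna(400)    # amyE segment replaced on integration (arbitrary)
back = rand_dna(300)      # annotated amyE-back homology region (arbitrary)
gap = rand_dna(227)       # RBS + first 215 nt of ldh, absent from the plasmid
ldh_frag = rand_dna(153)  # ldh fragment that follows amyE-back on the plasmid
rest = rand_dna(500)      # remainder of the ldh-lctP operon (arbitrary)

genome = front + middle + back + gap + ldh_frag + rest
insert = "N" * 50         # stand-in for the integrated cassette
cassette = front + insert + back + ldh_frag  # discontinuous downstream arm

def double_crossover(genome, cassette, upstream, downstream):
    """Swap the genomic span bounded by the two homology blocks
    for the corresponding span of the cassette."""
    g_start = genome.index(upstream)
    g_end = genome.index(downstream) + len(downstream)
    c_start = cassette.index(upstream)
    c_end = cassette.index(downstream) + len(downstream)
    return genome[:g_start] + cassette[c_start:c_end] + genome[g_end:]

# Major event: downstream crossover inside amyE-back.
major = double_crossover(genome, cassette, front, back)
# Minor event: downstream crossover inside the 153 bp ldh fragment.
minor = double_crossover(genome, cassette, front, ldh_frag)

print("insert present (major, minor):", insert in major, insert in minor)
print("227 bp region retained (major, minor):", gap in major, gap in minor)
print("length difference:", len(major) - len(minor))
```

In this model both products carry the insert in place of the central amyE segment, which is why both would score as amyE-disrupted, while only the minor product lacks the 227 bp stretch.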
However, we found that the homology regions in these commonly used amyE integration vectors are inconsistent with the genome sequence [16]. In the sequence of pDR111, the annotated amyE-back homology region is followed by an additional 153 bp sequence derived from a region of the genome 227 bp downstream of amyE-back (Fig. 1b; Supplementary material). The resulting extended homology region therefore has a gap, and the missing sequence corresponds to the start of the downstream ldh gene and its ribosome binding site. Due to this discontinuity in the homology region, in addition to the expected crossover at amyE-back, crossover can occur at the 153 bp region on the plasmid, disrupting ldh, a gene that codes for lactate dehydrogenase (LDH) [17] (Fig. 1c). By colony PCR, we found that four of 36 colonies tested after transformation with a derivative of pDR111 were missing the 227 bp region, indicating that the secondary crossover event occurs in a substantial proportion of transformants (Table S1, available in the online version of this article).
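To put the colony PCR count in perspective, the observed fraction and an approximate binomial interval can be computed directly. The sketch below uses the Wilson score interval, which is a choice made here for illustration and is not an analysis reported in the paper; the function name `wilson_interval` is likewise hypothetical.

```python
import math

def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for k successes in n trials."""
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

k, n = 4, 36  # colonies missing the 227 bp region, colonies tested
low, high = wilson_interval(k, n)
print(f"observed fraction: {k / n:.3f}")
print(f"approximate 95% interval: {low:.3f} to {high:.3f}")
```

With only 36 colonies the interval is wide (roughly 4% to 25%), so the ~10% figure is best read as an order-of-magnitude estimate of how often the minor event occurs.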
The discontinuous amyE-back homology region in pDR111 was inherited from pBGtrp, the original amyE double crossover integration vector developed in 1986 [3, 5, 6, 18, 19]. The pBGtrp homology arms were generated from subclones of the B. subtilis amyE gene that were used to sequence the gene in 1983. We found that the corresponding sequence deposited in GenBank is missing the same 227 bp, indicating that this region was likely lost in the process of preparing amyE for sequencing in E. coli [4]. In addition to pDR111, many amyE double crossover integration vectors developed over the past 40 years, including pDG1661, likely have inherited the same discontinuous homology arms from pBGtrp and its derivative vectors.
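A discontinuity of this kind can be checked for in any vector by comparing the length of a homology arm with the genomic distance between its two ends. The following is a minimal sketch, assuming exact and unique matches for the first and last 20 bases of the arm; the function `find_homology_gap`, the `seed` parameter, and the random test sequences are all hypothetical and only the 227 bp and 153 bp lengths come from the text. Real sequences would call for a proper aligner.

```python
import random

def find_homology_gap(genome, arm, seed=20):
    """Return how many genomic bases are missing from inside `arm`.

    0 means the arm is contiguous in the genome; None means one of the
    arm's end anchors was not found. Assumes exact, unique anchor matches.
    """
    start = genome.find(arm[:seed])
    if start < 0:
        return None
    end = genome.find(arm[-seed:], start)
    if end < 0:
        return None
    genomic_span = end + seed - start
    return genomic_span - len(arm)

# Toy data: random placeholder sequences with the layout described above.
random.seed(1)
rand_dna = lambda n: "".join(random.choice("ACGT") for _ in range(n))
upstream, back, gap, ldh_frag, downstream = (
    rand_dna(200), rand_dna(300), rand_dna(227), rand_dna(153), rand_dna(200)
)
genome = upstream + back + gap + ldh_frag + downstream

plasmid_arm = back + ldh_frag   # discontinuous arm, as in pBGtrp derivatives
corrected_arm = back            # arm with the 153 bp fragment removed

print(find_homology_gap(genome, plasmid_arm))
print(find_homology_gap(genome, corrected_arm))
```

On the toy data the discontinuous arm reports a 227-base gap and the corrected arm reports none.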
To facilitate correction of this error in future work, we constructed modified versions of pDR110 and pDR111 in which the 153 bp region downstream of amyE-back has been removed. The removal of the ldh homology region did not substantially impact transformation efficiency, and all colonies tested (18 of 18) integrated at amyE as expected for both plasmids. These plasmids are available on Addgene (www.addgene.org) as pGL003 (modified pDR110) and pGL004 (modified pDR111).
Historically, a single B. subtilis colony that tests positive by the starch test is carried forward after transformation for subsequent experiments. Our results suggest that, across all strains constructed with pBGtrp and its derivatives, roughly ten percent may be missing the ldh ribosome binding site and the first 215 nucleotides of the ldh coding sequence. Given LDH’s role in fermentative metabolism and anaerobic growth [20], an undetected crossover in ldh may have influenced the results of previous experiments performed in these conditions. Furthermore, even in aerobic growth, LDH plays a role in re-utilizing lactate that is excreted as a by-product of overflow metabolism [21]. During aerobic growth in LB, addition of supplemental glucose induces ldh expression, indicating that loss of LDH function may also affect experiments performed in the presence of oxygen [22].
This discrepancy can also influence studies with large-scale libraries of strains – whether pooled or arrayed – at the amyE site. Libraries of B. subtilis cells with pooled CRISPRi, overexpression, or reporter variants are powerful tools for discovery when coupled to modern high-throughput assays. When generating a library of B. subtilis variants, all cells that carry the intended antibiotic resistance cassette are carried forward from one or multiple transformation reactions. If the current, discontinuous amyE homology region is used, each transformed variant will integrate at amyE through one of the two possible crossover events (Fig. 1c). These distinct crossover events are challenging to distinguish in high-throughput experiments and introduce additional heterogeneity that could confound the results. Therefore, to ensure properly controlled experiments, especially in the context of fermentative B. subtilis studies, it will be important to correct the integration arms in future work.
Supplementary material
DOI: 10.1099/mic.0.001455. Uncited Table S1.
References
- 1. Harwood CR. Bacillus subtilis and its relatives: molecular biological and industrial workhorses. Trends Biotechnol 1992;10:247–256. doi:10.1016/0167-7799(92)90233-l. PMID 1368322.
- 2. Wozniak KJ, Simmons LA. Genome editing methods for Bacillus subtilis. Methods Mol Biol (Clifton NJ) 2022;2479:159–174. doi:10.1007/978-1-0716-2233-9. PMID 35583738. PMC 9519194.
- 3. Shimotsu H, Henner DJ. Construction of a single-copy integration vector and its use in analysis of regulation of the trp operon of Bacillus subtilis. Gene 1986;43:85–94. doi:10.1016/0378-1119(86)90011-9. PMID 3019840.
- 4. Yang M, Galizzi A, Henner D. Nucleotide sequence of the amylase gene from Bacillus subtilis. Nucleic Acids Res 1983;11:237–249. doi:10.1093/nar/11.2.237. PMID 6186986. PMC 325711.
- 5. Britton RA, Eichenberger P, Gonzalez-Pastor JE, Fawcett P, Monson R, et al. Genome-wide analysis of the stationary-phase sigma factor (sigma-H) regulon of Bacillus subtilis. J Bacteriol 2002;184:4881–4890. doi:10.1128/JB.184.17.4881-4890.2002. PMID 12169614. PMC 135291.
- 6. Guérout-Fleury AM, Frandsen N, Stragier P. Plasmids for ectopic integration in Bacillus subtilis. Gene 1996;180:57–61. doi:10.1016/s0378-1119(96)00404-0. PMID 8973347.
- 7. Kuhlmann NJ, Chien P. Selective adaptor dependent protein degradation in bacteria. Curr Opin Microbiol 2017;36:118–127. doi:10.1016/j.mib.2017.03.013. PMID 28458096. PMC 5534377.
- 8. Kalamara M, Spacapan M, Mandic-Mulec I, Stanley-Wall NR. Social behaviours by Bacillus subtilis: quorum sensing, kin discrimination and beyond. Mol Microbiol 2018;110:863–878. doi:10.1111/mmi.14127. PMID 30218468. PMC 6334282.
