Auto-MuRCiS: a streamlined software package for analysis of multiplex, randomized CRISPR interference sequencing data

Kevin S. Myers; Michael Place; Audrey P. Gasch; Daniel R. Noguera; Timothy J. Donohue

PMC · DOI:10.1128/mra.00356-24·July 31, 2024

Auto-MuRCiS: a streamlined software package for analysis of multiplex, randomized CRISPR interference sequencing data

Kevin S. Myers, Michael Place, Audrey P. Gasch, Daniel R. Noguera, Timothy J. Donohue

PDF

Open Access

TL;DR

Auto-MuRCiS is a new tool that makes analyzing complex CRISPR data faster and easier for researchers.

Contribution

Auto-MuRCiS introduces a streamlined, Docker-based analysis pipeline for MuRCiS data.

Findings

01

Auto-MuRCiS improves the speed and usability of MuRCiS data analysis.

02

The tool enables rapid identification of gene knockouts affecting microbial processes.

Abstract

Multiplex, randomized CRISPR interference sequencing (MuRCiS) allows for the simultaneous identification of multiple gene knockouts that together influence microbial processes. Here, we report on an updated analysis tool called Auto-MuRCiS that utilizes Docker to make the analysis of these data rapid and more user-friendly.

Figures1

Click any figure to enlarge with its caption.

Fig 1 — The Auto-MuRCiS pipeline and output. (A) Overview of MuRCiS procedure. Random combinations of CRISPRi spacers that target specific genes (colored squares) are separated by repeats (gray diamonds) on a plasmid. The plasmid is transformed into a microbial library and grown in defined conditions, and the plasmids in the input and output are isolated and sequenced using long-read sequencing. (B) A flow chart illustrating the Auto-MuRCiS pipeline. The pipeline is run in a Docker container and the single Bash script copies files into and out of the Docker container. The input files are shown at the top with green arrows, and the general steps of the pipeline are in orange boxes. The output files are listed with purple arrows, and specific files used in the steps of the pipeline are listed alongside the black arrows. All analyses can be run from a single script (murcs_script.py) if Docker is not available. (C and D) Box plots (C) and violin plots (D) for all sequenced reads. (E and F) Box plots (E) and violin plots (F) for reads with matches to the repeat sequence, note the smaller distribution compared to C and D plots. (G) Correlation plot showing pairwise instances of all possible pairs of spacers. Squares are colored based on number of times each spacer pair is identified. (H) Chord diagram showing a number of times each spacer pair appears together in a single read, with colors defined by the user in the Color File.

Keywords

CRISPRidockerpythonpipelinecomputation

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsCRISPR and Genetic Engineering · Genomics and Phylogenetic Studies · Single-cell and spatial transcriptomics

Full text

ANNOUNCEMENT

CRISPR interference (CRISPRi) can be used to inhibit or reduce the expression of specific genes, making it ideal to study gene function and its influence on phenotype (1). Multiplex, randomized CRISPRi sequencing (MuRCiS) can silence multiple, random sets of genes using defined and unique spacers in a single experiment (2). The MuRCiS procedure can be used to interrogate combinatorial effects of multi-gene inhibition in microbes, allowing researchers to study the impact of knocking down multiple genes (Fig. 1A) (2). However, analysis of these data as reported previously required comprehensive knowledge of bioinformatics and running many separate scripts, limiting its scope and utility. Here, we present Auto-MuRCiS, an open-source analysis pipeline to reproducibly process and analyze data from MuRCiS experiments (Fig. 1B).

The Auto-MuRCiS pipeline and output. (A) Overview of MuRCiS procedure. Random combinations of CRISPRi spacers that target specific genes (colored squares) are separated by repeats (gray diamonds) on a plasmid. The plasmid is transformed into a microbial library and grown in defined conditions, and the plasmids in the input and output are isolated and sequenced using long-read sequencing. (B) A flow chart illustrating the Auto-MuRCiS pipeline. The pipeline is run in a Docker container and the single Bash script copies files into and out of the Docker container. The input files are shown at the top with green arrows, and the general steps of the pipeline are in orange boxes. The output files are listed with purple arrows, and specific files used in the steps of the pipeline are listed alongside the black arrows. All analyses can be run from a single script (murcs_script.py) if Docker is not available. (C and D) Box plots (C) and violin plots (D) for all sequenced reads. (E and F) Box plots (E) and violin plots (F) for reads with matches to the repeat sequence, note the smaller distribution compared to C and D plots. (G) Correlation plot showing pairwise instances of all possible pairs of spacers. Squares are colored based on number of times each spacer pair is identified. (H) Chord diagram showing a number of times each spacer pair appears together in a single read, with colors defined by the user in the Color File.

Auto-MuRCiS utilizes a Docker image that maintains a consistent and reproducible environment in which to run the associated scripts (3). A single Bash script command is used and requires only three inputs to run using a single line of code: (i) a text file with the Binary Alignment Map (BAM) sequencing files to process; (ii) a two-column, tab delimited, text file indicating the sequence of the repeat and the various spacer sequences used in the experiment where the first column indicates the name of the sequence (“repeat” or the spacer ID) and the second column indicates the corresponding 5′ to 3′ DNA sequence; (iii) a two column, comma delimited, text file indicating the preferred colors to use for each gene in the Chord diagram plots generated by Auto-MuRCiS. Example input files can be found in the Example directory on GitHub. Auto-MuRCiS is designed to work with CCS BAM files from Pacific Biosciences (4). Modification would allow it to utilize other long-read sequencing methods, such as Oxford Nanopore Technologies (5). On a standard desktop or laptop computer (e.g., Intel Core i7 with 32 GB RAM) Auto-MuRCiS can process an experiment in ~30 minutes.

Auto-MuRCiS processes the data to report the lengths of reads with and without repeat sequences to visualize the differences as a control. The results are both written to text files and plotted as box and violin plots (Fig. 1C through F). The number of exact matches of spacer sequences provided for each gene is recorded for reads containing repeat sequences. Auto-MuRCiS will export the gene/spacer count values from each experiment to text files. Auto-MuRCiS will report on the pairwise distribution of events, which is how often each pair of spacers appear together in a read, regardless of the other spacers present in the read. Auto-MuRCiS will produce two publication-ready graphics: a correlation plot and a Chord diagram (Fig. 1G and H). The correlation plot can be used to gain an understanding of the overall distribution of the spacer pairs, while the Chord diagram can provide an easy way to view which pairs are most common in the data (using the user-provided color information). All output data are contained in a single directory that is copied out of the Docker container. Auto-MuRCiS will allow researchers to more easily adopt the powerful MuRCiS technology and address widespread biological questions.

Bibliography5

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Bock C, Datlinger P, Chardon F, Coelho MA, Dong MB, Lawson KA, Lu T, Maroc L, Norman TM, Song B, Stanley G, Chen S, Garnett M, Li W, Moffat J, Qi LS, Shapiro RS, Shendure J, Weissman JS, Zhuang X. 2022. High-content CRISPR screening. Nat Rev Methods Primers 2:9. doi:10.1038/s 43586-022-00098-737214176 PMC 10200264 · doi ↗ · pubmed ↗
2Ellis NA, Myers KS, Tung J, Davidson Ward A, Johnston K, Bonnington KE, Donohue TJ, Machner MP. 2023. A randomized multiplex CRISP Ri-seq approach for the identification of critical combinations of genes. Elife 12:RP 86903. doi:10.7554/e Life.8690338095310 PMC 10721215 · doi ↗ · pubmed ↗
3Merkel D. 2014. Docker: lightweight Linux containers for consistent development and deployment. Linux J Article 2
4Wenger AM, Peluso P, Rowell WJ, Chang P-C, Hall RJ, Concepcion GT, Ebler J, Fungtammasan A, Kolesnikov A, Olson ND, et al.. 2019. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat Biotechnol 37:1155–1162. doi:10.1038/s 41587-019-0217-931406327 PMC 6776680 · doi ↗ · pubmed ↗
5Shafin K, Pesout T, Lorig-Roach R, Haukness M, Olsen HE, Bosworth C, Armstrong J, Tigyi K, Maurer N, Koren S, et al.. 2020. Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes. Nat Biotechnol 38:1044–1053. doi:10.1038/s 41587-020-0503-632686750 PMC 7483855 · doi ↗ · pubmed ↗