Genome Sequence of Arthrobacter globiformis B-2979 Phage Raphaella
Hannah Alapati, Adam Parks, Tyler Hildebrand, Joshua Leazer, Kateryn Rodriguez, John Patton

TL;DR
This paper describes the genome sequence of a soil-isolated bacteriophage called Raphaella, which infects the bacterium Arthrobacter globiformis.
Contribution
The study provides a detailed genomic analysis of a new bacteriophage in the Actinobacteriophages AY cluster.
Findings
Raphaella's genome is 51,692 base pairs long with a GC content of 62.6%.
The phage contains 96 putative protein-encoding genes and one tRNA gene.
It belongs to the AY cluster of Actinobacteriophages based on gene content similarity.
Abstract
Bacteriophage Raphaella was isolated from a soil sample collected in Springfield, MO using Arthrobacter globiformis B2979-SEA. Raphaella has a genome of 51,692 base pairs with a GC content of 62.6%, 96 putative protein-encoding genes, and one tRNA gene. It has been placed in the AY cluster of Actinobacteriophages based on gene content similarity.
Figure 1
Taxonomy
Topics: Fractal and DNA sequence analysis · Machine Learning in Bioinformatics
Description
As antibiotic-resistant bacterial infections continue to rise, bacteriophages are being developed as an alternative therapeutic (Sulakvelidze et al., 2001; Strathdee and Patterson, 2019). In support of this development, the discovery and genetic characterization of novel phages is invaluable. Here, we report on the novel phage Raphaella, which was isolated in September 2023 from a soil sample collected at Valley Water Mill Park in Springfield, MO (GPS: 37.26409 N, 93.24684 W). The soil sample was wet and contained small roots. The sample was suspended in peptone-yeast calcium (PYCa) liquid medium, and the suspension was centrifuged (2,000 × g, 10 min). The supernatant was filtered (0.2 micron pore size), and the filtrate was inoculated with Arthrobacter globiformis B2979-SEA. Following 3 days of incubation at 30 °C with shaking, an aliquot of the resulting culture was filtered. The filtrate was plated in top agar with A. globiformis, giving rise to plaques of Raphaella after incubation of the plates at 30 °C for 3 days. Raphaella was purified through three rounds of plating (Zorawik et al., 2024). Raphaella forms clear plaques with a diameter of 0.8 ± 0.24 mm (n=5) (Zorawik et al., 2024) (Figure 1). A lysate was prepared and used to image virion particles by transmission electron microscopy with negative staining (1% uranyl acetate), revealing a capsid 59.2 ± 1.9 nm wide (n=5) and a tail 210.7 ± 15.8 nm long (n=5) (Figure 1).
DNA of Raphaella was extracted from the lysate using the Promega Wizard DNA clean-up kit, prepared with an NEB Ultra II Library Kit, and sequenced on an Illumina MiSeq with v3 reagents, yielding 618,710 150-bp reads that constituted approximately 1,701-fold coverage. These reads were assembled using Newbler v2.9 into a 51,692 bp genome with 62.6% GC content, and the 3' single-stranded genome termini were determined using Consed v29 (Russell, 2018; Gordon and Green, 2013).
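The coverage and GC figures above follow from simple arithmetic, which the short sketch below illustrates. The function names are hypothetical and not taken from the paper. The nominal coverage it computes assumes every read is a full 150 bp and contributes to the assembly, so it is an upper bound. The reported value of roughly 1,701-fold is lower, which would be consistent with read trimming or reads left out of the assembly, though the text does not say.

```python
def fold_coverage(read_count, read_length_bp, genome_length_bp):
    """Nominal fold coverage: total sequenced bases divided by genome length."""
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")
    return read_count * read_length_bp / genome_length_bp


def gc_content(sequence):
    """Percentage of G and C bases in a nucleotide sequence (case-insensitive)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# Read count, read length, and genome length as reported in the text.
nominal = fold_coverage(618_710, 150, 51_692)
print(f"Nominal upper-bound coverage: {nominal:.0f}x")

# Toy sequence for illustration only; this is not Raphaella sequence.
print(f"GC content of toy sequence: {gc_content('GGCCAT'):.1f}%")
```

Applying `gc_content` to the assembled genome sequence (for example, the GenBank record listed under the accession numbers) should give a value close to the reported 62.6%.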
Raphaella's genome was automatically annotated using DNA Master v5.23.6 (Pope and Jacobs-Sera, 2018) with embedded GeneMark v2.5 (Besemer and Borodovsky, 2005) and Glimmer v3.02 (Delcher et al., 2007). Translational starts were determined manually using the coding potential predicted by GeneMark (Besemer and Borodovsky, 2005) and then refined by comparison with similar genes using Starterator v578 (http://phages.wustl.edu/starterator/) and Phamerator v578 (Cresawn et al., 2011). Putative functions were assigned to the genes using PECAAN (discover.kbrinsgd.org) and its embedded BLASTp (Altschul et al., 1990) searches against the NCBI protein database and the actinobacteriophage database (Russell and Hatfull, 2016), as well as HHPred searches against the PDB_mmCIF70, SCOPe70, Pfam-A, and NCBI_Conserved_Domains (CD) databases (Söding et al., 2005). Using ARAGORN v1.2.41 (Laslett and Canback, 2004) and tRNAscan-SE v2.0 (Lowe and Eddy, 1997), a single tRNA gene, encoding a glycine tRNA, was identified. Nine potential membrane proteins were found using DeepTMHMM v1.0 (Hallgren et al., 2022). All software was used with default settings. The annotation process revealed a total of 96 genes, 39 of which could be assigned putative functions. Based on gene content similarity of over 35% to phages in the Actinobacteriophage database, phagesDB, Raphaella was assigned to the AY cluster (Russell and Hatfull, 2016).
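The cluster assignment rests on a gene content similarity (GCS) cutoff of 35%. The paper does not spell out the formula, so the sketch below assumes one common definition: the number of gene families ("phams") shared by two genomes, expressed as a fraction of each genome's pham count, with the two fractions averaged. The pham identifiers and the function name are hypothetical and serve only to show the arithmetic.

```python
def gene_content_similarity(phams_a, phams_b):
    """GCS in percent, assuming the averaged-shared-proportion definition.

    phams_a, phams_b: sets of gene family (pham) identifiers, one per genome.
    """
    if not phams_a or not phams_b:
        raise ValueError("each genome needs at least one pham")
    shared = len(phams_a & phams_b)
    # Average the shared fraction seen from each genome, then convert to percent.
    return 50.0 * (shared / len(phams_a) + shared / len(phams_b))


CLUSTER_THRESHOLD = 35.0  # percent, the cutoff stated in the text

# Hypothetical pham sets; real comparisons would use phagesDB pham assignments.
query = {"pham1", "pham2", "pham3", "pham4"}
member = {"pham2", "pham3", "pham4", "pham5", "pham6"}

gcs = gene_content_similarity(query, member)
print(f"GCS = {gcs:.1f}%, same cluster: {gcs > CLUSTER_THRESHOLD}")
```

In this toy case three phams are shared, giving (3/4 + 3/5) / 2 = 67.5%, which would clear the 35% cutoff.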
As with a majority of cluster AY phages, Raphaella encodes two tyrosine integrases. This, coupled with experimental evidence of lysogen formation by other cluster AY phages, suggests that Raphaella too is likely to establish lysogeny. We note, however, that no immunity repressor function could be predicted in Raphaella.
Nucleotide Sequence Accession numbers:
Raphaella is available at GenBank with Accession Number PP987873 and Sequence Read Archive Number SRX26311147.
References
- 1. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215(3): 403-410. doi: 10.1016/S0022-2836(05)80360-2. PMID: 2231712.
- 2. Besemer J, Borodovsky M. 2005. GeneMark: web software for gene finding in prokaryotes, eukaryotes and viruses. Nucleic Acids Res 33(Web Server issue): W451-W454. doi: 10.1093/nar/gki487. PMID: 15980510. PMCID: PMC1160247.
- 3. Cresawn SG, Bogel M, Day N, Jacobs-Sera D, Hendrix RW, Hatfull GF. 2011. Phamerator: a bioinformatic tool for comparative bacteriophage genomics. BMC Bioinformatics 12: 395. doi: 10.1186/1471-2105-12-395. PMID: 21991981. PMCID: PMC3233612.
- 4. Delcher AL, Bratke KA, Powers EC, Salzberg SL. 2007. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 23(6): 673-679. doi: 10.1093/bioinformatics/btm009. PMID: 17237039. PMCID: PMC2387122.
- 5. Gordon D, Green P. 2013. Consed: a graphical editor for next-generation sequencing. Bioinformatics 29(22): 2936-2937. doi: 10.1093/bioinformatics/btt515. PMID: 23995391. PMCID: PMC3810858.
- 6. Hallgren J, Tsirigos KD, Pedersen MD, Almagro Armenteros JJ, Marcatili P, Nielsen H, Krogh A, Winther O. 2022. DeepTMHMM predicts alpha and beta transmembrane proteins using deep neural networks. bioRxiv. doi: 10.1101/2022.04.08.487609.
- 7. Laslett D, Canback B. 2004. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res 32(1): 11-16. doi: 10.1093/nar/gkh152. PMID: 14704338. PMCID: PMC373265.
- 8. Lowe TM, Eddy SR. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25(5): 955-964. doi: 10.1093/nar/25.5.955. PMID: 9023104. PMCID: PMC146525.
