Genome-wide nucleotide-resolution model of single-strand break site reveals species evolutionary hierarchy
Sheng Xu, Junkang Wei, Yu Li

TL;DR
This paper introduces SSBlazer, a scalable deep learning model for predicting single-strand break sites in genomes, revealing evolutionary insights across 216 vertebrate species.
Contribution
The study presents the first computational method for nucleotide-resolution SSB prediction that is explainable, scalable, and capable of cross-species analysis.
Findings
SSBlazer accurately predicts SSB sites with reduced false positives.
The model captures key genomic features like CpG patterns and motifs.
SSBlazer generalizes well across diverse species.
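To make the "CpG patterns" finding concrete, here is a minimal sketch of the standard observed/expected CpG ratio for a sequence window. It illustrates the kind of sequence feature meant and is not SSBlazer's own attribution or explanation method. The function name `cpg_obs_exp` is hypothetical.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G).

    Illustrative statistic only; not taken from the SSBlazer code base.
    Returns 0.0 when the window contains no C or no G.
    """
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    # "CG" cannot overlap with itself, so str.count finds every CpG dinucleotide.
    n_cpg = seq.count("CG")
    return n_cpg * len(seq) / (n_c * n_g)


if __name__ == "__main__":
    for window in ("CGCGCGAT", "CCCGGGAT", "ATATATAT"):
        print(window, cpg_obs_exp(window))
```

Values well above those of the surrounding sequence indicate CpG-rich windows, which is the sort of pattern the finding refers to.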
Abstract
Single-strand breaks (SSBs) are the major form of DNA damage in the genome, arising spontaneously as the outcome of genotoxins and intermediates of DNA transactions. SSBs play a crucial role in various biological processes and show a non-random distribution in the genome. Several SSB detection approaches, such as S1 END-seq and SSiNGLe-ILM, have emerged to characterize the genomic landscape of SSBs at nucleotide resolution. However, these sequencing-based methods are costly and infeasible for large-scale analysis of diverse species. Thus, we proposed the first computational approach, SSBlazer, an explainable and scalable deep learning framework for genome-wide nucleotide-resolution SSB site prediction. We demonstrated that SSBlazer can accurately predict SSB sites and reduce false positives by constructing an imbalanced dataset to simulate the realistic SSB distribution. The…
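The imbalanced-dataset idea in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the helper names `one_hot` and `build_imbalanced`, the negatives-per-positive ratio, and the toy windows are all hypothetical, and the actual ratio and window length used by SSBlazer are not given in this summary.

```python
import random


def one_hot(seq):
    """Encode a DNA string as rows of [A, C, G, T]; unknown bases (e.g. N) become all zeros."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


def build_imbalanced(positives, negatives, neg_per_pos, seed=0):
    """Return shuffled (sequence, label) pairs with up to neg_per_pos negatives per positive.

    neg_per_pos is an assumed, user-chosen ratio meant to mimic how rare
    break sites are relative to non-break sites in a genome.
    """
    rng = random.Random(seed)
    n_neg = min(len(negatives), neg_per_pos * len(positives))
    sampled = rng.sample(negatives, n_neg)
    data = [(s, 1) for s in positives] + [(s, 0) for s in sampled]
    rng.shuffle(data)
    return data


if __name__ == "__main__":
    pos = ["ACGTACGT", "CGCGTTAA"]   # toy SSB-centred windows
    neg = ["TTTTAAAA"] * 50          # toy background windows
    dataset = build_imbalanced(pos, neg, neg_per_pos=10)
    features = [one_hot(seq) for seq, _ in dataset]
    print(len(dataset), sum(label for _, label in dataset))
```

Training and evaluating on such a skewed set, rather than a balanced one, makes the measured false-positive rate closer to what a genome-wide scan would produce.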
Taxonomy
Topics: Genomics and Phylogenetic Studies · Chromosomal and Genetic Variations · Genetic diversity and population structure
