Representing and decomposing genomic structural variants as balanced integer flows on sequence graphs
Daniel R. Zerbino, Tracy Ballinger, Benedict Paten, Glenn Hickey, and David Haussler

TL;DR
This paper introduces a comprehensive mathematical model for genomic structural variants, capturing complex rearrangements and copy number changes, and provides methods to sample possible evolutionary histories explaining genomic differences.
Contribution
It presents a novel model of structural variation as balanced integer flows on sequence graphs, enabling ergodic sampling of evolutionary histories.
Findings
Model encompasses balanced rearrangements and CNVs.
Allows ergodic sampling of evolutionary histories.
Facilitates analysis of complex genomic variations.
Abstract
The study of genomic variation has provided key insights into the functional role of mutations. Predominantly, studies have focused on single nucleotide variants (SNV), which are relatively easy to detect and can be described with rich mathematical models. However, it has been observed that genomes are highly plastic, and that whole regions can be moved, removed or duplicated in bulk. These structural variants (SV) have been shown to have a significant impact on the phenotype, but their study has been held back by the combinatorial complexity of the underlying models. We describe here a general model of structural variation that encompasses both balanced rearrangements and arbitrary copy-number variants (CNV). In this model, we show that the space of possible evolutionary histories that explain the structural differences between any two genomes can be sampled ergodically.
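To make the phrase "balanced integer flow on a sequence graph" more concrete, here is a minimal sketch in Python. It rests on assumptions that are not spelled out in the abstract: each vertex is one end of a genomic segment, "segment" edges join the two ends of a segment, "bond" edges join ends that are adjacent in a genome, and a flow counts as balanced when, at every vertex, the total segment flow equals the total bond flow. The function name `is_balanced`, the vertex labels, and the dictionary layout are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def is_balanced(segment_flow, bond_flow):
    """Check the (assumed) balance condition at every vertex.

    segment_flow, bond_flow: dicts mapping an edge (u, v) to an integer flow.
    Balanced means: at each vertex, segment flow minus bond flow is zero.
    """
    net = defaultdict(int)
    for (u, v), f in segment_flow.items():
        net[u] += f
        net[v] += f
    for (u, v), f in bond_flow.items():
        # A self-loop bond (u == v) is counted twice at that vertex.
        net[u] -= f
        net[v] -= f
    return all(value == 0 for value in net.values())


# A circular genome with two segments, A and B, each present once.
# "A_t"/"A_h" are the hypothetical tail and head ends of segment A.
reference_segments = {("A_t", "A_h"): 1, ("B_t", "B_h"): 1}
reference_bonds = {("A_h", "B_t"): 1, ("B_h", "A_t"): 1}

# Tandem duplication of B: copy number 2, plus a new bond from B's head
# back to B's tail. The extra copy and the extra adjacency offset each other.
duplicated_segments = {("A_t", "A_h"): 1, ("B_t", "B_h"): 2}
duplicated_bonds = {("A_h", "B_t"): 1, ("B_h", "A_t"): 1, ("B_h", "B_t"): 1}

# Raising B's copy number without any new adjacency violates the condition.
unbalanced_segments = {("A_t", "A_h"): 1, ("B_t", "B_h"): 2}

print(is_balanced(reference_segments, reference_bonds))
print(is_balanced(duplicated_segments, duplicated_bonds))
print(is_balanced(unbalanced_segments, reference_bonds))
```

Under these assumptions, the copy-number change and the new adjacency of a tandem duplication are captured together in a single integer-valued object, which is the sense in which one model can cover both rearrangements and CNVs.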
Taxonomy
Topics: Genomics and Phylogenetic Studies · Genome Rearrangement Algorithms · Chromosomal and Genetic Variations
