A Note on Optimal Sampling Strategy for Structural Variant Detection Using Optical Mapping
Weiwei Li, Jan Hannig, Corbin Jones

TL;DR
This paper develops an optimization method for sampling strategies in optical mapping to efficiently detect structural variants in human genomes, balancing sample size and detection confidence.
Contribution
It introduces a novel, analytically tractable approach using hyper-geometric models and concentration inequalities to determine optimal sampling in optical mapping.
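For orientation, the hyper-geometric model referred to here can be written out as follows. The symbols N, K, n, k, p and t are illustrative notation chosen for this note, not necessarily the paper's, and the tail bound shown is the classical Hoeffding inequality for sampling without replacement, not the novel bound the paper derives.

```latex
% X = number of variant-carrying molecules among n molecules sampled without
% replacement from N molecules, K of which carry the variant (assumed notation).
\[
  \Pr(X = k) = \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}},
  \qquad \max(0,\, n - N + K) \le k \le \min(n, K).
\]
% Classical Hoeffding bound on the lower tail, with p = K/N and t > 0:
\[
  \Pr\bigl(X \le n(p - t)\bigr) \le \exp\bigl(-2 n t^{2}\bigr).
\]
```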
Findings
Sampling most chromosomal fragments reduces the amount of biological material needed for high-confidence detection.
The proposed method is computationally efficient and provides tail bounds for hyper-geometric distributions.
Optimal sampling strategies improve detection efficiency in genomic structural variant analysis.
Abstract
Structural variants compose the majority of human genetic variation, but are difficult to assess using current genomic sequencing technologies. Optical mapping technologies, which measure the size of chromosomal fragments between labeled markers, offer an alternative approach. As these technologies mature towards becoming clinical tools, there is a need to develop an approach for determining the optimal strategy for sampling biological material in order to detect a variant at some threshold. Here we develop an optimization approach using a simple, yet realistic, model of the genomic mapping process using a hyper-geometric distribution and probabilistic concentration inequalities. Our approach is both computationally and analytically tractable and includes a novel approach to getting tail bounds of hyper-geometric distribution. We show that if a genomic mapping technology can sample…
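To make the sampling question in the abstract concrete, here is a minimal Python sketch, not the authors' method. It assumes a pool of N molecules covering a locus, K of which carry the variant, and asks how many molecules n must be sampled without replacement so that at least k variant molecules are observed with a given confidence. The function names and the parameters N, K, n, k are hypothetical, and the search uses the exact hyper-geometric tail by brute force in place of the analytic tail bounds the paper develops.

```python
from math import comb


def hypergeom_tail(N: int, K: int, n: int, k: int) -> float:
    """Exact P(X >= k) for X ~ Hypergeometric(N total, K variant, n sampled)."""
    if k <= 0:
        return 1.0
    upper = min(n, K)
    if k > upper:
        return 0.0
    # Sum the hyper-geometric pmf over i = k .. min(n, K).
    # comb() returns 0 when n - i exceeds N - K, so impossible terms vanish.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def min_sample_size(N: int, K: int, k: int, confidence: float):
    """Smallest n with P(X >= k) >= confidence, or None if no n <= N works.

    Illustrative brute-force search (assumption: N is small enough to scan).
    """
    for n in range(max(k, 0), N + 1):
        if hypergeom_tail(N, K, n, k) >= confidence:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical numbers: 200 molecules, 20 carry the variant,
    # and a call requires at least 3 supporting molecules.
    n_star = min_sample_size(N=200, K=20, k=3, confidence=0.95)
    print("molecules to sample:", n_star)
```

Because the tail probability is non-decreasing in n, the first n that meets the confidence level is the minimum; for large N a bisection search or an analytic bound would replace the linear scan.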
Taxonomy
Topics: Genetics, Bioinformatics, and Biomedical Research · Genomics and Rare Diseases · Biomedical Text Mining and Ontologies
