End-to-End Optimization of High Throughput DNA Sequencing
Eliza O'Reilly, Francois Baccelli, Gustavo de Veciana, Haris Vikalo

TL;DR
This paper introduces a stochastic geometry-based framework to model and optimize the entire high throughput DNA sequencing process, linking physical cluster formation to genome reconstruction success.
Contribution
It presents the first comprehensive model connecting physical cluster formation with computational genome assembly in sequencing platforms.
Findings
Provides a framework for analyzing sequencing cost and performance
Models the physical and computational stages of sequencing as an integrated process
Enables optimization of sequencing parameters for improved accuracy
Abstract
At the core of high throughput DNA sequencing platforms lies bridge amplification, a bio-physical surface process that results in a random geometry of clusters of homogeneous short DNA fragments, typically hundreds of base pairs long. The statistical properties of this random process and the length of the fragments are critical, as they affect the information that can subsequently be extracted, i.e., the density of successfully inferred DNA fragment reads. The ensemble of overlapping DNA fragment reads is then used to computationally reconstruct the much longer target genome sequence, e.g., ranging from hundreds of thousands to billions of base pairs. The success of the reconstruction in turn depends on having a sufficiently large ensemble of DNA fragments that are sufficiently long. In this paper, using stochastic geometry, we model and optimize the end-to-end process linking and partially…
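The trade-off the abstract describes, where denser seeding yields more clusters but also more overlap between them, can be illustrated with a toy calculation. The sketch below is not the paper's model. It assumes seeds form a homogeneous Poisson point process, every cluster is a disc of the same radius, and a cluster is readable only if no other seed lies within two radii of its center (Matérn type I thinning). The function names and parameters are hypothetical.

```python
import math


def usable_cluster_density(seed_density, cluster_radius):
    """Density of clusters that overlap no other cluster.

    Toy assumption: seeds are a Poisson process with intensity
    `seed_density`, and a cluster survives only if no other seed falls
    within distance 2 * cluster_radius (Matern type I thinning).
    """
    exclusion_area = math.pi * (2.0 * cluster_radius) ** 2
    # Probability that a disc of this area holds no other Poisson point.
    isolation_probability = math.exp(-seed_density * exclusion_area)
    return seed_density * isolation_probability


def best_seed_density(cluster_radius):
    """Seed density maximizing usable_cluster_density under the toy model."""
    exclusion_area = math.pi * (2.0 * cluster_radius) ** 2
    # d/dx [x * exp(-a * x)] = 0  gives  x = 1 / a.
    return 1.0 / exclusion_area


if __name__ == "__main__":
    radius = 1.0  # arbitrary length unit, chosen for illustration only
    optimum = best_seed_density(radius)
    for factor in (0.5, 1.0, 2.0):
        density = factor * optimum
        print(factor, usable_cluster_density(density, radius))
```

Under these assumptions the usable density rises and then falls as seeding gets denser, which is why a seeding density can be optimized at all. How cluster size relates to fragment length, and how read density feeds into reconstruction, is left to the paper.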
