TL;DR
This paper introduces a Bayesian nonparametric approach to optimize resource allocation in genome sequencing studies, improving predictions of new variants and enabling better experimental planning.
Contribution
The paper presents a novel Bayesian nonparametric methodology that predicts variant discovery and guides resource allocation, accommodating changes in experimental conditions.
Findings
More accurate variant prediction than recent methods
Effective at optimizing how a fixed budget is split between sequencing quality and quantity
Applicable to real-world genomic data from gnomAD
Abstract
While the cost of sequencing genomes has decreased dramatically in recent years, this expense often remains non-trivial. Under a fixed budget, then, scientists face a natural trade-off between quantity and quality; they can spend resources to sequence a greater number of genomes (quantity) or spend resources to sequence genomes with increased accuracy (quality). Our goal is to find the optimal allocation of resources between quantity and quality. Optimizing resource allocation promises to reveal as many new variations in the genome as possible, and thus as many new scientific insights as possible. In this paper, we consider the common setting where scientists have already conducted a pilot study to reveal variants in a genome and are contemplating a follow-up study. We introduce a Bayesian nonparametric methodology to predict the number of new variants in the follow-up study based on…
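The quantity versus quality trade-off under a fixed budget can be illustrated with a small sketch. This is not the paper's method: the cost model (a fixed per-genome cost plus a cost proportional to sequencing depth), the candidate depths, and especially `toy_expected_new_variants` are hypothetical stand-ins chosen only to make the example runnable. In the paper, the role of the predictor is played by the Bayesian nonparametric model fitted to the pilot study.

```python
import math


def feasible_allocations(budget, depths, fixed_cost, cost_per_depth_unit):
    """Return (n_genomes, depth) pairs that fit within the budget.

    Assumed (hypothetical) cost model: each genome costs
    fixed_cost + cost_per_depth_unit * depth.
    """
    allocations = []
    for depth in depths:
        cost_per_genome = fixed_cost + cost_per_depth_unit * depth
        n_genomes = int(budget // cost_per_genome)  # quantity affordable at this quality
        if n_genomes > 0:
            allocations.append((n_genomes, depth))
    return allocations


def toy_expected_new_variants(n_genomes, depth):
    """Toy placeholder predictor, NOT the paper's model.

    Assumes discoveries grow like sqrt(n_genomes) and that the chance of
    detecting a variant saturates as depth increases.
    """
    detection_prob = 1.0 - math.exp(-depth / 10.0)
    return detection_prob * math.sqrt(n_genomes)


def best_allocation(budget, depths, fixed_cost, cost_per_depth_unit, predictor):
    """Pick the feasible allocation with the largest predicted discovery count."""
    candidates = feasible_allocations(budget, depths, fixed_cost, cost_per_depth_unit)
    if not candidates:
        raise ValueError("budget too small for any candidate depth")
    return max(candidates, key=lambda alloc: predictor(*alloc))


if __name__ == "__main__":
    n, depth = best_allocation(
        budget=1000,
        depths=[10, 20, 40],
        fixed_cost=10,
        cost_per_depth_unit=1,
        predictor=toy_expected_new_variants,
    )
    print(f"sequence {n} genomes at depth {depth}")
```

With these made-up numbers the best choice is an intermediate depth: the lowest depth buys the most genomes but misses too many variants, while the highest depth leaves too few genomes. Swapping in a predictor fitted to pilot data would change the answer but not the structure of the search.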
