Variational inference for rare variant detection in deep, heterogeneous next-generation sequencing data
Fan Zhang, Patrick Flaherty

TL;DR
This paper introduces a Bayesian variational inference method for detecting rare genetic variants in heterogeneous sequencing data. The method is more computationally efficient than MCMC sampling and more specific than existing algorithms, and it can be used to track variant emergence.
Contribution
The paper presents a novel Bayesian model and a variational EM algorithm for rare variant detection in NGS data. The algorithm is more computationally efficient than MCMC sampling inference and achieves higher specificity than existing algorithms.
Findings
Comparable sensitivity and specificity to MCMC methods
Higher specificity than existing algorithms
Early detection of beneficial variants in yeast data
Abstract
The detection of rare variants is important for understanding the genetic heterogeneity in mixed samples. Recently, next-generation sequencing (NGS) technologies have enabled the identification of single nucleotide variants (SNVs) in mixed samples with high resolution. Yet, the noise inherent in the biological processes involved in next-generation sequencing necessitates the use of statistical methods to identify true rare variants. We propose a novel Bayesian statistical model and a variational expectation-maximization (EM) algorithm to estimate non-reference allele frequency (NRAF) and identify SNVs in heterogeneous cell populations. We demonstrate that our variational EM algorithm achieves sensitivity and specificity comparable to a Markov chain Monte Carlo (MCMC) sampling inference algorithm, and is more computationally efficient on tests of low coverage ( and…
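To make the idea of estimating a non-reference allele frequency (NRAF) concrete, the sketch below computes a Bayesian posterior for the NRAF at a single position from read counts. This is not the paper's model or its variational EM algorithm: it assumes a plain Beta prior with a Binomial read-count likelihood, which has a closed-form posterior, and the function name `nraf_posterior` and the prior parameters `prior_a` and `prior_b` are hypothetical names chosen for illustration.

```python
def nraf_posterior(non_ref_reads, depth, prior_a=1.0, prior_b=1.0):
    """Posterior mean and variance of the NRAF at one position.

    Assumption (not from the paper): the NRAF has a Beta(prior_a, prior_b)
    prior and the non-reference read count is Binomial(depth, NRAF), so the
    posterior is Beta(prior_a + non_ref_reads, prior_b + depth - non_ref_reads).
    """
    if depth < 0 or not 0 <= non_ref_reads <= depth:
        raise ValueError("need 0 <= non_ref_reads <= depth")
    a = prior_a + non_ref_reads
    b = prior_b + (depth - non_ref_reads)
    total = a + b
    mean = a / total
    variance = (a * b) / (total ** 2 * (total + 1.0))
    return mean, variance


# Hypothetical counts: 12 non-reference reads out of 4000 at one position.
mean, variance = nraf_posterior(non_ref_reads=12, depth=4000)
print(f"posterior mean NRAF = {mean:.5f}, sd = {variance ** 0.5:.5f}")
```

With deep coverage the posterior concentrates tightly around the observed fraction, which is why a rare variant at a fraction of a percent can in principle be separated from sequencing noise. A full method would also need a model of the position-specific error rate, which this sketch omits.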
Taxonomy
Topics: Cancer Genomics and Diagnostics · Gene expression and cancer classification · Genomics and Phylogenetic Studies
