A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data
Heng Li

TL;DR
This paper introduces a statistical framework that works directly on sequencing data to call SNPs, discover somatic mutations, estimate population genetic parameters, and perform association tests without explicit genotyping. It targets settings where accurate genotypes are hard to obtain, such as multi-sample low-coverage sequencing and somatic mutation discovery.
Contribution
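It presents a statistical approach that operates on the sequencing data itself rather than on called genotypes or linkage-based imputation, so that SNP calling, somatic mutation discovery, population genetic inference, and association testing can all account for genotype uncertainty.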
Findings
On real data, achieves accuracy comparable to alternative methods for estimating site allele count, inferring the allele frequency spectrum, and association mapping.
Highlights the importance of symmetric datasets for somatic mutation detection.
Identifies mismapping as a major source of errors in rare event discovery.
Abstract
Motivation: Most existing methods for DNA sequence analysis rely on accurate sequences or genotypes. However, in applications of the next-generation sequencing (NGS), accurate genotypes may not be easily obtained (e.g. multi-sample low-coverage sequencing or somatic mutation discovery). These applications press for the development of new methods for analyzing sequence data with uncertainty. Results: We present a statistical framework for calling SNPs, discovering somatic mutations, inferring population genetical parameters and performing association tests directly based on sequencing data without explicit genotyping or linkage-based imputation. On real data, we demonstrate that our method achieves comparable accuracy to alternative methods for estimating site allele count, for inferring allele frequency spectrum and for association mapping. We also highlight the necessity of using…
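To make "without explicit genotyping" concrete, the sketch below estimates a site allele frequency from per-sample genotype likelihoods with a standard expectation-maximization (EM) update under Hardy-Weinberg equilibrium. It is an illustration under stated assumptions, not necessarily the paper's exact formulation: the function name `estimate_allele_frequency`, its parameters, and the example likelihood values are all hypothetical, samples are assumed diploid, and every sample is assumed to have at least one genotype with non-zero likelihood.

```python
from math import comb


def estimate_allele_frequency(genotype_likelihoods, f_init=0.5, max_iter=100, tol=1e-8):
    """Estimate the alternate-allele frequency at one site by EM.

    genotype_likelihoods: one triple per diploid sample, giving
    P(reads | g) for g = 0, 1, 2 copies of the alternate allele.
    No genotype is ever called; each sample contributes its
    expected alternate-allele count instead. (Hypothetical helper.)
    """
    n = len(genotype_likelihoods)
    f = f_init
    for _ in range(max_iter):
        expected_alt = 0.0
        for lik in genotype_likelihoods:
            # E-step: posterior weight of each genotype, using the
            # Hardy-Weinberg prior C(2, g) * f^g * (1 - f)^(2 - g).
            weights = [
                lik[g] * comb(2, g) * f**g * (1 - f) ** (2 - g)
                for g in range(3)
            ]
            total = sum(weights)
            expected_alt += sum(g * w for g, w in enumerate(weights)) / total
        # M-step: the new frequency is the expected fraction of alternate alleles.
        f_new = expected_alt / (2 * n)
        if abs(f_new - f) < tol:
            return f_new
        f = f_new
    return f


# Made-up likelihoods for four low-coverage samples (illustrative values only).
low_coverage = [
    (0.90, 0.09, 0.01),
    (0.30, 0.60, 0.10),
    (0.50, 0.45, 0.05),
    (0.05, 0.35, 0.60),
]
print(estimate_allele_frequency(low_coverage))
```

When every sample's likelihood is concentrated on a single genotype, the estimate reduces to simple allele counting; when the likelihoods are flat, the data carry no information and the estimate stays at its starting value.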
