Improved haplotyping of rare variants using next-generation sequence data
Fouad Zakharia, Carlos Bustamante

TL;DR
This paper introduces a statistical method that integrates paired-end sequencing read data with population genotype data to improve haplotype phasing accuracy in human genomes, especially for rare variants.
Contribution
The method extends shapeIT by incorporating sequencing read information, enhancing haplotype reconstruction accuracy in unrelated individuals using next-generation sequencing data.
Findings
Switch error rates are significantly reduced (4-15%) across different panels.
Greater improvements observed with longer reads and higher throughput.
Enhanced accuracy in phasing rare variants using the new approach.
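For readers unfamiliar with the metric, the switch error rate can be illustrated with a minimal sketch. It assumes the common definition (the fraction of adjacent heterozygous-site pairs whose relative phase disagrees with the truth); the function name `switch_error_rate` and the input encoding are illustrative choices, not taken from the paper.

```python
from typing import Sequence


def switch_error_rate(truth: Sequence[int], inferred: Sequence[int]) -> float:
    """Fraction of adjacent heterozygous-site pairs phased inconsistently.

    Both inputs list the allele (0 or 1) carried by ONE haplotype at each
    heterozygous site, in genomic order. Because the sites are heterozygous,
    the second haplotype is simply the complement, so it is not passed in.
    """
    if len(truth) != len(inferred):
        raise ValueError("truth and inferred must cover the same sites")
    if len(truth) < 2:
        return 0.0  # no adjacent pair, so no switch is possible

    # agree[i] is True when the inferred haplotype matches the truth at site i.
    agree = [t == i for t, i in zip(truth, inferred)]

    # A switch occurs wherever agreement flips between neighbouring sites.
    switches = sum(1 for a, b in zip(agree, agree[1:]) if a != b)
    return switches / (len(truth) - 1)


if __name__ == "__main__":
    # One switch between the second and third sites, out of four intervals.
    print(switch_error_rate([0, 1, 0, 1, 1], [0, 1, 1, 0, 0]))
```

Counting flips in agreement, rather than raw mismatches, means an inferred haplotype that is the exact complement of the truth scores zero errors, which is the intended behaviour since haplotype labels are arbitrary.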
Abstract
Accurate identification of haplotypes in sequenced human genomes can provide invaluable information about population demography and fine-scale correlations along the genome, thus empowering both population genomic and medical association studies. Yet phasing unrelated individuals remains a challenging problem. Incorporating available data from high throughput sequencing into traditional statistical phasing approaches is a promising avenue to alleviate these issues. We present a novel statistical method that expands on an existing graphical haplotype reconstruction method (shapeIT) to incorporate phasing information from paired-end read data. The algorithm harnesses the haplotype graph information estimated by shapeIT from genotypes across the population and refines haplotype likelihoods for a given individual to be compatible with the sequencing data. Applying the method to HapMap…
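The idea of refining population-based haplotype likelihoods so that they are compatible with sequencing reads can be sketched as a simple Bayesian reweighting. This is a toy illustration under stated assumptions, not the paper's algorithm and not shapeIT's haplotype graph model: candidate phasings are given as an explicit list with prior weights, each read (or read pair) is reduced to the alleles it shows at heterozygous sites, and base errors are independent with a fixed rate. All names (`reweight_by_reads`, `read_likelihood`, `error_rate`) are hypothetical.

```python
from typing import Dict, List, Tuple

Haplotype = Tuple[int, ...]   # allele (0/1) at each heterozygous site
Read = Dict[int, int]         # site index -> allele observed on the fragment


def read_likelihood(read: Read, hap: Haplotype, error_rate: float) -> float:
    """Probability of the observed alleles if the fragment came from `hap`."""
    lik = 1.0
    for site, allele in read.items():
        lik *= (1.0 - error_rate) if hap[site] == allele else error_rate
    return lik


def reweight_by_reads(
    candidates: List[Tuple[Haplotype, float]],
    reads: List[Read],
    error_rate: float = 0.01,  # assumed per-base error rate
) -> List[Tuple[Haplotype, float]]:
    """Multiply each candidate's prior weight by the read likelihood, then normalise."""
    weights = []
    for hap, prior in candidates:
        other = tuple(1 - a for a in hap)  # complementary haplotype at het sites
        w = prior
        for read in reads:
            # A fragment is equally likely to originate from either haplotype.
            w *= 0.5 * read_likelihood(read, hap, error_rate) \
               + 0.5 * read_likelihood(read, other, error_rate)
        weights.append(w)

    total = sum(weights)
    if total == 0.0:
        raise ValueError("all candidates have zero weight")
    return [(hap, w / total) for (hap, _), w in zip(candidates, weights)]


if __name__ == "__main__":
    # Two het sites; the population prior cannot decide between the phasings.
    priors = [((0, 0), 0.5), ((0, 1), 0.5)]
    # One read pair shows allele 0 at both sites, supporting the first phasing.
    for hap, post in reweight_by_reads(priors, [{0: 0, 1: 0}]):
        print(hap, round(post, 4))
```

A read pair only carries phase information when it spans at least two heterozygous sites, which is consistent with the reported finding that longer reads and higher throughput give larger improvements.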
Taxonomy
Topics: Genetic Associations and Epidemiology · Gene expression and cancer classification · Genomics and Rare Diseases
