Improving sequence-based genotype calls with linkage disequilibrium and pedigree information
Baiyu Zhou, Alice S. Whittemore

TL;DR
This paper introduces likelihood-based methods that enhance genotype calling accuracy from sequencing data by leveraging linkage disequilibrium and pedigree information, outperforming traditional methods that treat loci independently.
Contribution
The paper presents novel likelihood-based approaches that incorporate LD and pedigree data into genotype calling, improving accuracy over existing independent-locus methods.
Findings
The proposed likelihood-based methods outperform traditional independent-locus calling in simulations
Incorporating LD improves genotype accuracy
Pedigree information further enhances calls
Abstract
Whole and targeted sequencing of human genomes is a promising, increasingly feasible tool for discovering genetic contributions to risk of complex diseases. A key step is calling an individual's genotype from the multiple aligned short read sequences of his DNA, each of which is subject to nucleotide read error. Current methods are designed to call genotypes separately at each locus from the sequence data of unrelated individuals. Here we propose likelihood-based methods that improve calling accuracy by exploiting two features of sequence data. The first is the linkage disequilibrium (LD) between nearby SNPs. The second is the Mendelian pedigree information available when related individuals are sequenced. In both cases the likelihood involves the probabilities of read variant counts given genotypes, summed over the unobserved genotypes. Parameters governing the prior genotype…
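To make the pedigree likelihood described in the abstract concrete, here is a minimal sketch for a single biallelic SNP in a father-mother-child trio. It is an illustration under stated assumptions, not the authors' implementation: the binomial read-error model, the Hardy-Weinberg founder prior, the fixed `error` and `freq` values, and every function name below are hypothetical choices made for this example. It also ignores LD and de novo mutation. The read-count probabilities given genotypes are multiplied by the prior and summed over all unobserved genotype configurations, as the abstract outlines.

```python
from itertools import product
from math import comb

GENOTYPES = (0, 1, 2)  # number of copies of the variant allele


def read_likelihood(variant_reads, total_reads, genotype, error=0.01):
    """P(variant read count | genotype) under an assumed binomial error model."""
    # Probability that a single read shows the variant allele.
    p = {0: error, 1: 0.5, 2: 1.0 - error}[genotype]
    ref_reads = total_reads - variant_reads
    return comb(total_reads, variant_reads) * p**variant_reads * (1 - p)**ref_reads


def hwe_prior(genotype, freq):
    """Assumed Hardy-Weinberg prior for a founder with variant allele frequency freq."""
    return comb(2, genotype) * freq**genotype * (1 - freq)**(2 - genotype)


def transmission(child, father, mother):
    """Mendelian P(child genotype | parental genotypes), no de novo mutation."""
    # Each parent transmits the variant allele with probability genotype / 2.
    pf, pm = father / 2, mother / 2
    return {
        0: (1 - pf) * (1 - pm),
        1: pf * (1 - pm) + (1 - pf) * pm,
        2: pf * pm,
    }[child]


def trio_posterior(reads, freq, error=0.01):
    """Joint posterior over (father, mother, child) genotypes.

    reads: three (variant_reads, total_reads) pairs, ordered father, mother, child.
    """
    joint = {}
    for f, m, c in product(GENOTYPES, repeat=3):
        prior = hwe_prior(f, freq) * hwe_prior(m, freq) * transmission(c, f, m)
        lik = 1.0
        for (v, n), g in zip(reads, (f, m, c)):
            lik *= read_likelihood(v, n, g, error)
        joint[(f, m, c)] = prior * lik
    # The likelihood of the observed reads is the sum over unobserved genotypes.
    total = sum(joint.values())
    return {config: value / total for config, value in joint.items()}


def child_marginal(posterior):
    """Marginal posterior of the child's genotype, indexed by variant allele count."""
    marginal = [0.0, 0.0, 0.0]
    for (_, _, c), p in posterior.items():
        marginal[c] += p
    return marginal


# Hypothetical data: both parents show 0 variant reads out of 20,
# the child shows 2 variant reads out of 6.
posterior = trio_posterior([(0, 20), (0, 20), (2, 6)], freq=0.01)
print(child_marginal(posterior))
```

With these made-up counts, the child's reads taken alone favor a heterozygous call, but the well-covered, apparently homozygous-reference parents make that configuration very unlikely under Mendelian transmission, so the trio model shifts the child's posterior toward homozygous reference. In a fuller treatment the allele frequency and error rate would be estimated from the data rather than fixed.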
