PopIns: population-scale detection of novel sequence insertions
Birte Kehr, Páll Melsted, Bjarni V. Halldórsson

TL;DR
PopIns is a computational tool that detects and characterizes novel sequence insertions of at least 100 bp across populations from short-read sequencing data, with better accuracy and efficiency than previous methods.
Contribution
The paper introduces PopIns, a new method that combines assembly, merging, anchoring, and genotyping to detect novel sequence insertions at population scale more accurately.
Findings
PopIns outperforms MindTheGap in recall and precision.
Merging contigs enhances insertion prediction quality.
Preliminary tests on 305 Icelanders demonstrate feasibility.
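To make the idea of merging contigs across individuals concrete, here is a minimal sketch. It is not PopIns's merging procedure, which this summary does not describe. It assumes a much simpler rule: a contig is absorbed when it, or its reverse complement, is fully contained in a longer contig already kept. The function names, the input layout (a dict from sample ID to contig strings), and the `min_len` parameter are all hypothetical.

```python
from typing import Dict, List, Tuple

# Complement table for DNA bases; N stays N.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def merge_contained_contigs(
    contigs_by_sample: Dict[str, List[str]], min_len: int = 100
) -> List[Tuple[str, List[str]]]:
    """Pool contigs from all samples and collapse contained ones.

    Toy rule (an assumption, not the PopIns algorithm): a contig is
    absorbed if it or its reverse complement is a substring of a
    longer contig that was already kept.
    Returns (representative sequence, sorted supporting samples).
    """
    # Pool every contig that passes the length threshold.
    pooled = []
    for sample, contigs in contigs_by_sample.items():
        for contig in contigs:
            contig = contig.upper()
            if len(contig) >= min_len:
                pooled.append((sample, contig))

    # Longest first, so shorter contigs can be absorbed by longer ones;
    # ties are broken by sequence and sample for a deterministic order.
    pooled.sort(key=lambda item: (-len(item[1]), item[1], item[0]))

    merged = []  # each entry: [representative sequence, set of samples]
    for sample, seq in pooled:
        rc = reverse_complement(seq)
        for entry in merged:
            if seq in entry[0] or rc in entry[0]:
                entry[1].add(sample)  # record support, keep the longer contig
                break
        else:
            merged.append([seq, {sample}])

    return [(seq, sorted(samples)) for seq, samples in merged]
```

Calling `merge_contained_contigs({"s1": [...], "s2": [...]})` returns one representative sequence per group, together with the samples that support it. A real pipeline would need to handle sequencing errors and partial overlaps, which exact substring matching does not.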
Abstract
The detection of genomic structural variation (SV) has advanced tremendously in recent years due to progress in high-throughput sequencing technologies. Novel sequence insertions, insertions without similarity to a human reference genome, have received less attention than other types of SVs due to the computational challenges in their detection from short read sequencing data, which inherently involves de novo assembly. De novo assembly is not only computationally challenging, but also requires high-quality data. While the reads from a single individual may not always meet this requirement, using reads from multiple individuals can increase power to detect novel insertions. We have developed the program PopIns, which can discover and characterize non-reference insertions of 100 bp or longer on a population scale. In this paper, we describe the approach we implemented in PopIns. It takes…
Taxonomy
Topics: Genomics and Phylogenetic Studies · RNA and protein synthesis mechanisms · Chromosomal and Genetic Variations
