Improving CNV Detection Performance Except for Software-Specific Problematic Regions
Jinha Hwang, Jung Hye Byeon, Baik-Lin Eun, Myung-Hyun Nam, Yunjung Cho, Seung Gyu Yun

TL;DR
This study improves the accuracy of detecting copy number variations from whole exome sequencing data by identifying and filtering out problematic genomic regions specific to different software tools.
Contribution
The study introduces a method to identify and filter software-specific problematic regions in WES-based CNV detection, significantly reducing false positives.
Findings
Software-specific problematic regions were identified across the full WES cohort; on average, 2210 sequencing target baits (1.23%) were classified as problematic.
ExomeDepth showed notable improvements in sensitivity and positive predictive value after filtration of problematic regions.
Targeted filtration reduced false positives and improved overall performance of CNV detection tools.
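The two metrics named in the findings can be made concrete with a small sketch. Sensitivity is the fraction of reference (CMA) CNVs recovered by a caller, and positive predictive value (PPV) is the fraction of a caller's CNVs that match a reference CNV. The matching rule used here (any overlap on the same chromosome), the `Interval` type, and the function names are assumptions for illustration; this summary does not state the paper's exact matching criterion.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    """A genomic interval with a half-open [start, end) coordinate range."""
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def sensitivity_and_ppv(
    calls: List[Interval], reference: List[Interval]
) -> Tuple[float, float]:
    """Compute sensitivity and PPV under an assumed any-overlap matching rule."""
    # Reference CNVs recovered by at least one call (true positives, reference side).
    detected = sum(1 for r in reference if any(overlaps(r, c) for c in calls))
    # Calls supported by at least one reference CNV (true positives, call side).
    supported = sum(1 for c in calls if any(overlaps(c, r) for r in reference))
    sensitivity = detected / len(reference) if reference else float("nan")
    ppv = supported / len(calls) if calls else float("nan")
    return sensitivity, ppv
```

Removing false-positive calls shrinks the PPV denominator while leaving the supported calls in place, which is how filtration can raise PPV without changing the reference set.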
Abstract
Background/Objectives: Whole exome sequencing (WES) is an effective method for detecting disease-causing variants. However, copy number variation (CNV) detection using WES data often has limited sensitivity and high false-positive rates. Methods: In this study, we constructed a reference CNV set using chromosomal microarray analysis (CMA) data from 44 of 180 individuals who underwent WES and CMA and evaluated four WES-based CNV callers (CNVkit, CoNIFER, ExomeDepth, and cn.MOPS) against this benchmark. For each tool, we first defined software-specific problematic genomic regions across the full WES cohort and filtered out the CNVs that overlapped these regions. Results: The four algorithms showed low mutual concordance and distinct distributions in the problematic regions. On average, 2210 sequencing target baits (1.23%) were classified as problematic; these baits had lower mappability…
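As a minimal sketch of the filtration step described in the abstract, the snippet below drops every CNV call that overlaps a problematic region defined for the same tool. It assumes the per-tool problematic regions are already available; how the paper derives them is not covered in this summary. The function names, the `(chrom, start, end)` tuple layout, and the coordinates in the usage example are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (chromosome, start, end), half-open coordinates.
Region = Tuple[str, int, int]


def overlaps(a: Region, b: Region) -> bool:
    """True if two regions share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def filter_calls(
    calls_by_tool: Dict[str, List[Region]],
    problematic_by_tool: Dict[str, List[Region]],
) -> Dict[str, List[Region]]:
    """Keep only calls that do not overlap the calling tool's own problematic regions."""
    filtered = {}
    for tool, calls in calls_by_tool.items():
        # A tool with no listed problematic regions keeps all of its calls.
        regions = problematic_by_tool.get(tool, [])
        filtered[tool] = [
            call for call in calls
            if not any(overlaps(call, region) for region in regions)
        ]
    return filtered


# Usage with made-up coordinates: only the ExomeDepth call on chr1:100-200 is removed.
example_calls = {
    "ExomeDepth": [("chr1", 100, 200), ("chr1", 5000, 6000)],
    "CNVkit": [("chr1", 100, 200)],
}
example_problematic = {"ExomeDepth": [("chr1", 150, 160)]}
kept = filter_calls(example_calls, example_problematic)
```

Keying the regions by tool reflects the software-specific nature of the filter: the same genomic interval may be removed for one caller and retained for another.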
Taxonomy
Topics: Genomics and Rare Diseases · Genomic variations and chromosomal abnormalities · Genetic Associations and Epidemiology
