Tail-Greedy Unbalanced Haar Wavelet Segmentation for Copy Number Alteration Data
Maharani Ahsani Ummi, Stuart Barber, Henry M. Wood, and Arief Gusnanto

TL;DR
This paper introduces TGUHm, a modified tail-greedy unbalanced Haar segmentation method for detecting copy number alterations (CNAs). It improves the detection of short segments in noisy sequencing data by reducing false positives while keeping sensitivity high.
Contribution
The study presents a dual-thresholding tail-greedy unbalanced Haar approach that outperforms existing methods in detecting CNAs, especially short aberrations, in noisy data.
Findings
TGUHm achieves higher true positive rates than CBS, HaarSeg, and FDRSeg.
In simulations under Gaussian and heavy-tailed noise, the method also yields lower false positive rates than these methods.
Application to real cancer data reveals biologically relevant CNAs.
Abstract
Detecting copy number alterations (CNAs) from next-generation sequencing data remains challenging, particularly for short segments under noisy conditions. Existing segmentation methods often suffer from high false positive rates or fail to reliably detect short aberrations, especially in low-coverage data. In this study, we propose a modified tail-greedy unbalanced Haar (TGUHm) method that introduces a dual-thresholding strategy to improve segmentation accuracy. The proposed approach effectively suppresses spurious spikes while preserving sensitivity to both short and long CNA segments. Extensive simulation studies under Gaussian and heavy-tailed noise demonstrate that TGUHm consistently achieves higher true positive rates and lower false positive rates compared to state-of-the-art methods, including CBS, HaarSeg, and FDRSeg. In particular, the proposed method improves detection…
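As a rough illustration of the building block behind unbalanced Haar segmentation, the sketch below computes the unbalanced Haar detail coefficient for one candidate breakpoint and compares it with a threshold. It does not reproduce TGUHm: the tail-greedy construction and the dual-thresholding rule are not specified in the text above. The function names are hypothetical, and the universal threshold sigma * sqrt(2 log n) is a generic wavelet-thresholding choice assumed here for illustration only.

```python
import math


def uh_detail(x, n_left):
    """Unbalanced Haar detail coefficient for splitting x after its first n_left points.

    Equals sqrt(n_left * n_right / n) * (mean_left - mean_right), i.e. the inner
    product of x with the unbalanced Haar vector for that split.
    """
    n = len(x)
    n_right = n - n_left
    if not 0 < n_left < n:
        raise ValueError("split must leave at least one point on each side")
    mean_left = sum(x[:n_left]) / n_left
    mean_right = sum(x[n_left:]) / n_right
    return math.sqrt(n_left * n_right / n) * (mean_left - mean_right)


def best_split(x):
    """Return (n_left, coefficient) for the split with the largest |coefficient|."""
    candidates = ((k, uh_detail(x, k)) for k in range(1, len(x)))
    return max(candidates, key=lambda pair: abs(pair[1]))


def universal_threshold(sigma, n):
    """Generic universal threshold (an assumption here, not the paper's rule)."""
    return sigma * math.sqrt(2 * math.log(n))


# Toy read-depth-like signal: a short gain of height 2 at the end, no noise.
signal = [0.0, 0.0, 0.0, 0.0, 2.0, 2.0]
n_left, coef = best_split(signal)
# sigma=0.5 is an assumed noise level for illustration.
is_change = abs(coef) > universal_threshold(sigma=0.5, n=len(signal))
```

With a single threshold, short segments produce small coefficients because the factor sqrt(n_left * n_right / n) shrinks as one side gets short. That is one way to see why short aberrations are hard to separate from noise spikes.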
