SALIENT: Frequency-Aware Paired Diffusion for Controllable Long-Tail CT Detection
Yifan Li, Mehrdad Salimitari, Taiyu Zhang, Guang Li, David Dreizin

TL;DR
SALIENT is a frequency-aware diffusion framework for controllable data augmentation in long-tail CT detection. It synthesizes paired lesion-masking volumes in wavelet space, improving both the realism of the synthetic volumes and downstream detection performance.
Contribution
The paper proposes a mask-conditioned, wavelet-domain diffusion method that combines frequency-aware objectives with paired supervision to provide controllable, efficient augmentation for long-tail CT detection.
Findings
MS-SSIM increases from 0.63 to 0.83, indicating improved realism
FID decreases from 118.4 to 46.5, indicating better generation quality
Detection performance improves, especially at low prevalences
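Low prevalence is the hard regime for detection because precision depends on the base rate even when a detector's sensitivity and specificity stay fixed. The sketch below applies Bayes' rule to show this. The function name `precision` and the operating point (sensitivity 0.90, specificity 0.95) are illustrative assumptions, not values reported in the paper.

```python
def precision(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule for a fixed operating point."""
    true_pos = sensitivity * prevalence                    # P(detected and lesion)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # P(detected and no lesion)
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point; only the prevalence changes between rows.
for prev in (0.10, 0.01, 0.001):
    print(f"prevalence={prev:.3f}  precision={precision(0.90, 0.95, prev):.3f}")
```

With this assumed operating point, precision falls from roughly 0.67 at 10% prevalence to roughly 0.02 at 0.1% prevalence, although the detector itself is unchanged.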
Abstract
Detection of rare lesions in whole-body CT is fundamentally limited by extreme class imbalance and low target-to-volume ratios, producing precision collapse despite high AUROC. Synthetic augmentation with diffusion models offers promise, yet pixel-space diffusion is computationally expensive, and existing mask-conditioned approaches lack controllable attribute-level regulation and paired supervision for accountable training. We introduce SALIENT, a mask-conditioned wavelet-domain diffusion framework that synthesizes paired lesion-masking volumes for controllable CT augmentation under long-tail regimes. Instead of denoising in pixel space, SALIENT performs structured diffusion over discrete wavelet coefficients, explicitly separating low-frequency brightness from high-frequency structural detail. Learnable frequency-aware objectives disentangle target and background attributes…
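To make the separation of low-frequency brightness from high-frequency detail concrete, here is a minimal sketch of a one-level discrete wavelet transform. It assumes an orthonormal Haar wavelet applied to a single 2D slice with even height and width, using NumPy only. This summary does not state which wavelet family, how many decomposition levels, or what 3D handling SALIENT uses, so those choices and the function names `haar_dwt2` and `haar_idwt2` are assumptions for illustration.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)


def haar_dwt2(x: np.ndarray):
    """One-level orthonormal 2D Haar transform (assumes even height and width)."""
    # Pairwise sums and differences along axis 0.
    a = (x[0::2, :] + x[1::2, :]) / SQRT2
    d = (x[0::2, :] - x[1::2, :]) / SQRT2
    # Repeat along axis 1 to obtain the four subbands.
    ll = (a[:, 0::2] + a[:, 1::2]) / SQRT2  # low-frequency approximation
    lh = (a[:, 0::2] - a[:, 1::2]) / SQRT2  # detail along axis 1
    hl = (d[:, 0::2] + d[:, 1::2]) / SQRT2  # detail along axis 0
    hh = (d[:, 0::2] - d[:, 1::2]) / SQRT2  # diagonal detail
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh) -> np.ndarray:
    """Inverse of haar_dwt2; reconstructs the original slice exactly."""
    rows, cols = ll.shape
    a = np.empty((rows, 2 * cols))
    d = np.empty((rows, 2 * cols))
    a[:, 0::2], a[:, 1::2] = (ll + lh) / SQRT2, (ll - lh) / SQRT2
    d[:, 0::2], d[:, 1::2] = (hl + hh) / SQRT2, (hl - hh) / SQRT2
    x = np.empty((2 * rows, 2 * cols))
    x[0::2, :], x[1::2, :] = (a + d) / SQRT2, (a - d) / SQRT2
    return x


# Toy slice standing in for CT data (random values, not real intensities).
rng = np.random.default_rng(0)
slice_2d = rng.normal(size=(8, 8))

bands = haar_dwt2(slice_2d)
brighter = haar_dwt2(slice_2d + 10.0)  # uniform brightness shift

print(np.allclose(haar_idwt2(*bands), slice_2d))                       # perfect reconstruction
print(np.allclose(brighter[0] - bands[0], 20.0))                       # shift lands in LL only
print(all(np.allclose(b, s) for b, s in zip(bands[1:], brighter[1:])))  # details unchanged
```

In this toy setting a uniform brightness shift changes only the LL subband, while the three detail subbands are untouched. That is the property that lets a wavelet-domain model treat brightness and structural detail as separate targets.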
Taxonomy
Topics: Medical Imaging Techniques and Applications · Advanced X-ray and CT Imaging · Radiation Dose and Imaging
