CASED: Curriculum Adaptive Sampling for Extreme Data Imbalance
Andrew Jesson, Nicolas Guizard, Sina Hamidi Ghalehjegh, Damien Goblot, Florian Soudan, Nicolas Chapados

TL;DR
The paper presents CASED, a curriculum sampling algorithm that improves the training of deep learning segmentation models on highly imbalanced datasets, achieving state-of-the-art results in lung nodule detection using only a trivial detection stage.
Contribution
Introduces CASED, a novel curriculum sampling method that enhances training of deep segmentation models on imbalanced data, outperforming traditional two-stage detection approaches.
Findings
Achieves 88.35% sensitivity on the LUNA16 benchmark.
Models trained with CASED are robust to annotation quality.
Framework generalizes across imaging modalities and segmentation targets.
Abstract
We introduce CASED, a novel curriculum sampling algorithm that facilitates the optimization of deep learning segmentation or detection models on data sets with extreme class imbalance. We evaluate the CASED learning framework on the task of lung nodule detection in chest CT. In contrast to two-stage solutions, wherein nodule candidates are first proposed by a segmentation model and refined by a second detection stage, CASED improves the training of deep nodule segmentation models (e.g. UNet) to the point where state-of-the-art results are achieved using only a trivial detection stage. CASED improves the optimization of deep segmentation models by allowing them to first learn how to distinguish nodules from their immediate surroundings, while continuously adding a greater proportion of difficult-to-classify global context, until uniformly sampling from the empirical data distribution.…
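The curriculum described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the linear annealing schedule, the function names (nodule_focus_probability, sample_patch_center), and the anneal_steps parameter are all assumptions introduced here. The abstract only states that training starts near nodules and gradually shifts toward uniform sampling from the empirical data distribution.

```python
import random


def nodule_focus_probability(step, anneal_steps):
    """Probability of forcing a nodule-centered patch at a given training step.

    Assumption: a linear schedule from 1.0 down to 0.0. The source does not
    specify the schedule, so this is only one plausible choice.
    """
    return max(0.0, 1.0 - step / anneal_steps)


def sample_patch_center(step, nodule_voxels, all_voxels, anneal_steps, rng):
    """Pick the center voxel of the next training patch.

    Early in training, centers come from nodule voxels, so the model first
    learns to separate nodules from their immediate surroundings. As the
    focus probability shrinks, more centers are drawn uniformly from the
    whole volume, which adds global context. Once it reaches zero, sampling
    is uniform over all voxels, i.e. the empirical data distribution.
    """
    q = nodule_focus_probability(step, anneal_steps)
    if rng.random() < q:
        return rng.choice(nodule_voxels)
    return rng.choice(all_voxels)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy 4x4x4 volume with two hypothetical nodule voxels.
    volume = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)]
    nodules = [(1, 1, 1), (2, 2, 2)]
    for step in (0, 50, 100):
        q = nodule_focus_probability(step, anneal_steps=100)
        center = sample_patch_center(step, nodules, volume, 100, rng)
        print(step, q, center)
```

With this schedule, every patch at step 0 is centered on a nodule, and from step anneal_steps onward the sampler ignores the nodule list entirely, so the class balance seen by the model matches the raw data.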
