GCE-MIL: Faithful and Recoverable Evidence for Multiple Instance Learning in Whole-Slide Imaging
Xiangyu Li, Ran Su

TL;DR
GCE-MIL introduces a novel framework for multiple instance learning in whole-slide imaging that explicitly optimizes evidence quality, improving interpretability and performance across multiple datasets and models.
Contribution
It proposes a backbone-agnostic wrapper with mechanisms for evidence grounding, differentiable evidence search, and discrete evidence recovery, addressing limitations of attention-based MIL models.
Findings
Improves Macro-F1 by 0.024 and C-index by 0.014 across datasets.
Reduces the gap between continuous and discrete evidence selection by 4-7.
Inference runs up to 5 times faster with minimal utility loss.
Abstract
Multiple instance learning (MIL) is the standard approach for whole-slide image (WSI) classification and survival prediction, where attention-based models aggregate patch features into slide-level predictions. These models treat attention weights as evidence for their predictions, but attention is optimized for classification, not for identifying which patches actually support the diagnosis. This conflation leads to three failures: selected patches are insufficient (keeping them alone drops Macro-F1 by 0.078), unnecessary (removing them barely changes the prediction), and unrecoverable (continuous attention scores disagree with discrete patch subsets used at inference). The central premise is that evidence quality should be optimized directly through explicit criteria, namely Sufficiency, Necessity, and Recoverability (S/N/R), rather than inherited as a byproduct of classification. GCE-MIL…
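The three failure modes named in the abstract can be probed on any attention-pooling model with a few lines of NumPy. The sketch below is illustrative only and is not the GCE-MIL method: the linear attention scorer `w_att`, the linear classifier `W_cls`, the top-`k` selection rule, and the three score definitions are all assumptions made for this example, since the abstract does not specify how the paper measures S/N/R.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax; entries set to -inf receive weight 0.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(feats, w_att, W_cls, mask=None):
    """Toy attention-MIL forward pass on one bag of patch features.

    feats: (N, D) patch features; w_att: (D,) attention scorer (assumed linear);
    W_cls: (C, D) linear classifier; mask: optional (N,) bool patch subset.
    """
    scores = feats @ w_att
    if mask is not None:
        scores = np.where(mask, scores, -np.inf)  # exclude unselected patches
    att = softmax(scores)
    slide = att @ feats                            # attention-weighted slide embedding
    return softmax(W_cls @ slide), att

def evidence_probe(feats, w_att, W_cls, k):
    """Hypothetical S/N/R-style probes using the top-k attended patches."""
    n = len(feats)
    assert 0 < k < n, "k must leave at least one patch in and one patch out"
    probs_full, att = attention_pool(feats, w_att, W_cls)
    label = int(np.argmax(probs_full))
    top = np.zeros(n, dtype=bool)
    top[np.argsort(att)[-k:]] = True               # discrete evidence subset
    probs_keep, _ = attention_pool(feats, w_att, W_cls, mask=top)
    probs_drop, _ = attention_pool(feats, w_att, W_cls, mask=~top)
    return {
        # Large value: the evidence alone does not support the prediction.
        "sufficiency_drop": float(probs_full[label] - probs_keep[label]),
        # Near zero: removing the evidence barely changes the prediction.
        "necessity_drop": float(probs_full[label] - probs_drop[label]),
        # L1 gap between soft-attention and hard-subset predictions.
        "recoverability_gap": float(np.abs(probs_full - probs_keep).sum()),
    }

rng = np.random.default_rng(0)
feats = rng.normal(size=(50, 8))                   # 50 synthetic patches, 8-dim features
report = evidence_probe(feats, rng.normal(size=8), rng.normal(size=(3, 8)), k=5)
print(report)
```

Under these assumed definitions, evidence of the kind the abstract argues for would show a small `sufficiency_drop`, a large `necessity_drop`, and a small `recoverability_gap`; a model that only inherits attention from classification has no reason to satisfy any of the three.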
