TL;DR
LoMix introduces a learnable, differentiable module for fusing multi-scale logits in medical image segmentation, improving accuracy and data efficiency without increasing inference cost.
Contribution
It proposes a NAS-inspired, plug-and-play logits mixing module that automatically discovers optimal scale combinations and fusion operators for improved segmentation.
Findings
Improves Dice scores by up to 13.5% across benchmarks.
Enhances performance significantly when training data is scarce.
Operates with zero inference overhead.
Abstract
U-shaped networks output logits at multiple spatial scales, each capturing a different blend of coarse context and fine detail. Yet, training still treats these logits in isolation - either supervising only the final, highest-resolution logits or applying deep supervision with identical loss weights at every scale - without exploring mixed-scale combinations. Consequently, the decoder output misses the complementary cues that arise only when coarse and fine predictions are fused. To address this issue, we introduce LoMix (Logits Mixing), a NAS-inspired, differentiable plug-and-play module that generates new mixed-scale outputs and learns how exactly each of them should guide the training process. More precisely, LoMix mixes the multi-scale decoder logits with four lightweight fusion operators: addition, multiplication, concatenation, and attention-based weighted fusion, yielding a rich…
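To make the four fusion operators named in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes the logits from each scale have already been resampled to a common resolution, and it shows a single pixel, where each scale contributes one list of per-class logits. The function names (`fuse_add`, `fuse_mul`, `fuse_concat`, `fuse_attention`) and the `scores` parameter are hypothetical. The abstract as shown here is truncated, so how concatenated logits are mapped back to the class dimension and how the attention weights are produced are assumptions: the concatenation is left unprojected, and the attention weights come from a softmax over one scalar score per scale.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_add(logits_a, logits_b):
    """Element-wise sum of two per-class logit vectors."""
    return [a + b for a, b in zip(logits_a, logits_b)]


def fuse_mul(logits_a, logits_b):
    """Element-wise product of two per-class logit vectors."""
    return [a * b for a, b in zip(logits_a, logits_b)]


def fuse_concat(logits_a, logits_b):
    """Concatenate along the class axis (assumption: a learned projection
    back to the class count would follow; it is omitted here)."""
    return list(logits_a) + list(logits_b)


def fuse_attention(logits_list, scores):
    """Weighted sum of logit vectors. Assumption: one scalar score per
    scale, normalised with softmax to give the fusion weights."""
    weights = softmax(scores)
    n_classes = len(logits_list[0])
    return [
        sum(w * logits[c] for w, logits in zip(weights, logits_list))
        for c in range(n_classes)
    ]


# Hypothetical two-class logits at one pixel from a coarse and a fine scale.
coarse = [2.0, -1.0]
fine = [0.5, 1.5]

mixed = {
    "add": fuse_add(coarse, fine),
    "mul": fuse_mul(coarse, fine),
    "concat": fuse_concat(coarse, fine),
    "attention": fuse_attention([coarse, fine], scores=[0.0, 0.0]),
}
```

With equal scores the attention variant reduces to a plain average of the two scales. During training the scores would be learnable parameters, so the weighting can shift toward whichever scale is more useful.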