TL;DR
This paper introduces a two-stage active learning method for semantic segmentation that uses features from a pre-trained diffusion model to select the most informative pixels to label, reducing labeling cost under very small pixel budgets while maintaining high accuracy.
Contribution
We propose a diffusion-based two-stage active learning pipeline that effectively balances diversity and uncertainty for low-budget semantic segmentation.
Findings
Outperforms existing methods on four benchmark datasets.
Achieves high segmentation accuracy with minimal labeled pixels.
Demonstrates effectiveness under extreme pixel-budget constraints.
Abstract
Semantic segmentation demands dense pixel-level annotations, which can be prohibitively expensive, especially under extremely constrained labeling budgets. In this paper, we address the problem of low-budget active learning for semantic segmentation by proposing a novel two-stage selection pipeline. Our approach leverages a pre-trained diffusion model to extract rich multi-scale features that capture both global structure and fine details. In the first stage, we perform a hierarchical, representation-based candidate selection by first choosing a small subset of representative pixels per image using MaxHerding, and then refining these into a diverse global pool. In the second stage, we compute an entropy-augmented disagreement score (eDALD) over noisy multi-scale diffusion features to capture both epistemic uncertainty and prediction confidence, selecting the most informative pixels for…
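The two stages described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract is truncated and does not define eDALD, so the score below is an assumption: a BALD-style disagreement term (mutual information across predictions made from several noisy feature draws) plus the entropy of the mean prediction. The first-stage function is a greedy kernel-coverage selection in the spirit of MaxHerding, with an RBF kernel chosen here for illustration. All function names (`rbf_kernel`, `greedy_coverage_select`, `entropy_plus_disagreement`) and the `gamma` parameter are hypothetical, and plain arrays stand in for diffusion features.

```python
import numpy as np


def rbf_kernel(feats, gamma=1.0):
    """Pairwise RBF similarities between feature rows (kernel choice is an assumption)."""
    sq = np.sum(feats ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * feats @ feats.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def greedy_coverage_select(feats, budget, gamma=1.0):
    """Stage 1 sketch: greedily pick points that most increase kernel coverage."""
    K = rbf_kernel(feats, gamma)
    n = K.shape[0]
    covered = np.zeros(n)            # best similarity of each point to the chosen set
    taken = np.zeros(n, dtype=bool)  # marks already chosen candidates
    chosen = []
    for _ in range(min(budget, n)):
        # Average coverage gain over all points if candidate j were added.
        gains = np.maximum(K - covered[:, None], 0.0).mean(axis=0)
        gains[taken] = -np.inf
        best = int(np.argmax(gains))
        chosen.append(best)
        taken[best] = True
        covered = np.maximum(covered, K[:, best])
    return chosen


def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy in nats along the class axis."""
    return -np.sum(p * np.log(p + eps), axis=axis)


def entropy_plus_disagreement(probs):
    """Stage 2 sketch (assumed form, not the paper's eDALD definition).

    probs has shape (T, N, C): class probabilities for N candidate pixels,
    predicted from T noisy feature draws.
    """
    mean_p = probs.mean(axis=0)
    total = entropy(mean_p)                 # entropy of the averaged prediction
    expected = entropy(probs).mean(axis=0)  # average entropy of each prediction
    disagreement = total - expected         # BALD-style mutual information
    return disagreement + total


# Toy usage: two well-separated clusters, then three candidate pixels.
feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
picked = greedy_coverage_select(feats, budget=2)

probs = np.array([
    [[1.0, 0.0], [1.0, 0.0], [0.5, 0.5]],
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
])
scores = entropy_plus_disagreement(probs)
ranking = np.argsort(-scores)  # highest score first
```

In the toy data, the coverage step picks one point from each cluster, and the assumed score ranks the pixel whose predictions disagree above the pixel that is uniformly uncertain, with the confidently agreed pixel last. In a pipeline like the one the abstract describes, the highest-scoring pixels from the candidate pool would be the ones sent for annotation.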