BALD-SAM: Disagreement-based Active Prompting in Interactive Segmentation
Prithwijit Chowdhury, Mohit Prabhushankar, Ghassan AlRegib

TL;DR
BALD-SAM introduces an active learning approach for interactive image segmentation that uses model uncertainty to select the most informative prompts, ranking first or second on 14 of the 16 datasets evaluated.
Contribution
This work pioneers the use of Bayesian active learning with disagreement (BALD) for spatial prompt selection in interactive segmentation, enabling automated, uncertainty-driven prompt placement.
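For readers unfamiliar with the criterion, the standard BALD acquisition score is the mutual information between a prediction and the model parameters. The block below states that general definition applied to a candidate prompt location; the symbols (x for a candidate location, y for its predicted label, θ for model parameters, D for the prompts observed so far) are notation chosen here for illustration, and how BALD-SAM instantiates the posterior over θ is not described in this summary.

```latex
\[
\mathrm{BALD}(x)
  = \mathbb{I}\left[y ; \theta \mid x, \mathcal{D}\right]
  = \mathbb{H}\left[y \mid x, \mathcal{D}\right]
  - \mathbb{E}_{\theta \sim p(\theta \mid \mathcal{D})}
      \left[\mathbb{H}\left[y \mid x, \theta\right]\right]
\]
```

The first term is the entropy of the averaged prediction and the second is the average entropy of individual predictions, so the score is high only where sampled models are individually confident but disagree with one another.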
Findings
Outperforms human prompting and oracle prompts on several benchmarks.
Achieves top or second-best performance on 14 out of 16 datasets.
Surpasses one-shot baselines, especially on complex objects.
Abstract
The Segment Anything Model (SAM) has revolutionized interactive segmentation through spatial prompting. While existing work primarily focuses on automating prompts in various settings, real-world annotation workflows involve iterative refinement where annotators observe model outputs and strategically place prompts to resolve ambiguities. Current pipelines typically rely on the annotator's visual assessment of the predicted mask quality. We postulate that a principled approach for automated interactive prompting is to use a model-derived criterion to identify the most informative region for the next prompt. In this work, we establish active prompting: a spatial active learning approach where locations within images constitute an unlabeled pool and prompts serve as queries to prioritize information-rich regions, increasing the utility of each interaction. We further present BALD-SAM: a…
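The idea of treating image locations as an unlabeled pool and picking the next prompt by a model-derived criterion can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a stack of per-pixel foreground probabilities from several stochastic forward passes is already available (how BALD-SAM obtains such samples is not covered in this excerpt), and the names `bald_map`, `next_prompt_location`, and `already_prompted` are hypothetical.

```python
import numpy as np


def binary_entropy(p, eps=1e-12):
    """Elementwise entropy (in nats) of a Bernoulli distribution with parameter p."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def bald_map(sample_probs):
    """Per-pixel BALD score.

    sample_probs: array of shape (S, H, W) holding foreground probabilities
    from S stochastic forward passes (an assumed input, see lead-in).
    Returns an (H, W) array: entropy of the mean prediction minus the
    mean entropy of the individual predictions.
    """
    sample_probs = np.asarray(sample_probs, dtype=float)
    predictive_entropy = binary_entropy(sample_probs.mean(axis=0))
    expected_entropy = binary_entropy(sample_probs).mean(axis=0)
    return predictive_entropy - expected_entropy


def next_prompt_location(sample_probs, already_prompted=None):
    """Return (row, col, score) of the highest-disagreement location.

    already_prompted: optional boolean (H, W) mask of locations to exclude
    (hypothetical helper argument for iterative use).
    """
    scores = bald_map(sample_probs)
    if already_prompted is not None:
        scores = np.where(already_prompted, -np.inf, scores)
    row, col = np.unravel_index(np.argmax(scores), scores.shape)
    return int(row), int(col), float(scores[row, col])


# Toy 2x2 image with 4 sampled predictions per pixel.
toy = np.array([
    [[0.90, 0.50], [0.05, 0.10]],
    [[0.90, 0.50], [0.95, 0.10]],
    [[0.90, 0.50], [0.05, 0.10]],
    [[0.90, 0.50], [0.95, 0.10]],
])
row, col, score = next_prompt_location(toy)
print(row, col)
```

In the toy input, pixel (0, 1) is uncertain in every sample (probability 0.5) but the samples agree, so its BALD score is zero; pixel (1, 0) flips between confident background and confident foreground, so it is the one selected. This distinction between plain uncertainty and disagreement is what separates BALD from a purely entropy-based criterion.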
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
