Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation
Jian Hu, Jiayi Lin, Junchi Yan, Shaogang Gong

TL;DR
This paper introduces ProMaC, a framework that leverages hallucinations from multimodal large language models (MLLMs) to improve task-generic promptable segmentation. It iteratively refines prompts and masks, reducing reliance on manual prompts and improving segmentation accuracy.
Contribution
It proposes a new iterative cycle framework that uses hallucinations to extract and verify task-related information, significantly improving prompt quality in promptable segmentation.
Findings
ProMaC outperforms existing methods on 5 benchmarks.
Hallucination-based prompts lead to more accurate segmentation masks.
Iterative refinement reduces irrelevant hallucinations and improves task focus.
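The iterative prompt and mask refinement summarized above can be sketched roughly as follows. This is only an illustrative skeleton, not the authors' implementation: `refine_cycle`, `propose_prompts`, `segment`, and `verify` are hypothetical names standing in for an MLLM prompt generator, a promptable segmenter (such as SAM), and a verification step. Their signatures, the scoring rule, and the number of iterations are all assumptions.

```python
from typing import Callable, List, Optional, Tuple

Mask = List[List[int]]  # toy binary mask; a real system would use an array type
Prompt = str


def refine_cycle(
    image: object,
    task_prompt: str,
    propose_prompts: Callable[[object, str, Optional[Mask]], List[Prompt]],
    segment: Callable[[object, Prompt], Mask],
    verify: Callable[[object, Mask, str], float],
    num_iters: int = 3,
) -> Tuple[Optional[Prompt], Optional[Mask], float]:
    """Alternate between prompt generation and mask generation.

    All three callables are hypothetical placeholders (assumptions):
      propose_prompts: derives candidate instance-specific prompts from the
                       task-generic prompt, optionally guided by the current mask.
      segment:         turns one candidate prompt into a mask.
      verify:          scores how well a mask fits the task (higher is better).
    """
    best_prompt: Optional[Prompt] = None
    best_mask: Optional[Mask] = None
    best_score = float("-inf")
    mask: Optional[Mask] = None

    for _ in range(num_iters):
        # Candidate prompts may include hallucinated guesses; verification filters them.
        for prompt in propose_prompts(image, task_prompt, mask):
            candidate_mask = segment(image, prompt)
            score = verify(image, candidate_mask, task_prompt)
            if score > best_score:
                best_prompt, best_mask, best_score = prompt, candidate_mask, score
        # Feed the best mask so far back into the next round of prompt generation.
        mask = best_mask

    return best_prompt, best_mask, best_score
```

In this sketch the best-scoring mask found so far is passed back to the prompt generator on each round, which is one simple way to make later prompts depend on earlier masks. The framework's actual mechanism may differ.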
Abstract
Promptable segmentation typically requires instance-specific manual prompts to guide the segmentation of each desired object. To minimize such a need, task-generic promptable segmentation has been introduced, which employs a single task-generic prompt to segment various images of different objects in the same task. Current methods use Multimodal Large Language Models (MLLMs) to reason detailed instance-specific prompts from a task-generic prompt for improving segmentation accuracy. The effectiveness of this segmentation heavily depends on the precision of these derived prompts. However, MLLMs often suffer hallucinations during reasoning, resulting in inaccurate prompting. While existing methods focus on eliminating hallucinations to improve a model, we argue that MLLM hallucinations can reveal valuable contextual insights when leveraged correctly, as they represent pre-trained…
Taxonomy
Topics: Epilepsy research and treatment · EEG and Brain-Computer Interfaces · Pain Management and Placebo Effect
Methods: Generalizable SAM · Segment Anything Model
