Don't Guess, Just Ask: Resolving Ambiguity in Referring Segmentation via Multi-turn Clarification

Yuting Yang; Haichao Jiang; Tianming Liang; Quan Zhang; Jian-Fang Hu

arXiv:2605.17531·cs.CV·May 19, 2026


TL;DR

This paper introduces IC-Seg, a framework that resolves ambiguous referring segmentation queries through multi-turn clarification, improving accuracy and reducing redundant interactions.

Contribution

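The paper proposes IC-Seg, an agentic framework that proactively clarifies ambiguous queries before performing referring segmentation, together with Hi-GRPO, a hierarchical optimization strategy used to train this clarification behavior.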

Findings

01 IC-Seg significantly outperforms existing methods on ambiguous query resolution.

02 The hierarchical optimization strategy enhances dialogue efficiency and segmentation accuracy.

03 IC-Seg maintains state-of-the-art performance on standard benchmarks.

Abstract

Referring segmentation aims to segment target objects in images or videos based on a textual query. Despite remarkable progress in recent years, existing works assume that user-provided queries are already precise and unambiguous. This assumption is impractical: in real-world scenarios, users cannot be expected to thoroughly review their visual content and ensure that every query identifies a unique target. When a query is ambiguous, existing segmentation models tend to guess the user's intent arbitrarily, often producing undesired results. To address this limitation, we propose IC-Seg, a novel agentic framework that proactively clarifies user intent through multi-turn conversation before segmentation. To effectively incentivize this capability, we further introduce Hi-GRPO, a new hierarchical optimization strategy…
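The "clarify before segmenting" idea described in the abstract can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the IC-Seg implementation: the `Candidate` class, the attribute dictionaries, `pick_question`, `clarify_then_segment`, and the simulated user are all hypothetical names introduced here, and the question-selection heuristic (ask about the attribute with the most distinct values) is a stand-in for whatever policy the paper actually learns.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical object proposal: a mask id plus descriptive attributes."""
    mask_id: int
    attributes: Dict[str, str]


def matches(candidate: Candidate, constraints: Dict[str, str]) -> bool:
    """True if the candidate satisfies every known constraint."""
    return all(candidate.attributes.get(k) == v for k, v in constraints.items())


def pick_question(remaining: List[Candidate], known: Dict[str, str]) -> Optional[str]:
    """Pick an unasked attribute that still separates the remaining candidates.

    Assumed heuristic: prefer the attribute with the most distinct values.
    Returns None when no attribute can tell the candidates apart.
    """
    keys = sorted({k for c in remaining for k in c.attributes} - set(known))
    useful = [k for k in keys if len({c.attributes.get(k) for c in remaining}) > 1]
    if not useful:
        return None
    return max(useful, key=lambda k: len({c.attributes.get(k) for c in remaining}))


def clarify_then_segment(
    candidates: List[Candidate],
    query_constraints: Dict[str, str],
    answer_fn: Callable[[str, List[str]], str],
    max_turns: int = 3,
) -> Tuple[List[int], int]:
    """Ask clarifying questions only while the query is ambiguous.

    Returns the ids of the selected masks and the number of questions asked.
    """
    known = dict(query_constraints)
    remaining = [c for c in candidates if matches(c, known)]
    turns = 0
    while len(remaining) > 1 and turns < max_turns:
        key = pick_question(remaining, known)
        if key is None:  # nothing left to ask; stop instead of guessing a question
            break
        options = sorted({c.attributes[key] for c in remaining if key in c.attributes})
        known[key] = answer_fn(key, options)  # one clarification turn
        remaining = [c for c in remaining if matches(c, known)]
        turns += 1
    return [c.mask_id for c in remaining], turns


# Toy scene: two dogs and one cat, so the query "the dog" is ambiguous.
CANDIDATES = [
    Candidate(1, {"category": "dog", "color": "brown", "position": "left"}),
    Candidate(2, {"category": "dog", "color": "white", "position": "right"}),
    Candidate(3, {"category": "cat", "color": "white", "position": "right"}),
]


def simulated_user(key: str, options: List[str]) -> str:
    """Simulated user who has candidate 2 in mind."""
    return CANDIDATES[1].attributes[key]


if __name__ == "__main__":
    print(clarify_then_segment(CANDIDATES, {"category": "dog"}, simulated_user))
    print(clarify_then_segment(CANDIDATES, {"category": "cat"}, simulated_user))
```

In this toy setting the ambiguous query triggers a single question, while the unambiguous query goes straight to selection with no dialogue, which mirrors the stated goal of avoiding redundant interactions.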


Code & Models

Repositories

iSEE-Laboratory/IC-Seg (GitHub)
