PiClick: Picking the desired mask from multiple candidates in click-based interactive segmentation
Cilin Yan, Haochen Wang, Jie Liu, Xiaolong Jiang, Yao Hu, Xu Tang, Guoliang Kang, Efstratios Gavves

TL;DR
PiClick introduces a Transformer-based interactive segmentation method that generates multiple candidate masks and automatically suggests the most plausible one, reducing ambiguity and human effort in pixel-level annotation.
Contribution
The paper presents PiClick, a novel network that produces all potentially reasonable masks for a click and automatically suggests the most plausible one, addressing target ambiguity in click-based segmentation.
Findings
Outperforms previous methods on 9 datasets.
Effectively reduces human effort in mask selection.
Generates multiple masks for ambiguous scenes.
Abstract
Click-based interactive segmentation aims to generate target masks via human clicking, which facilitates efficient pixel-level annotation and image editing. In such a task, target ambiguity remains a problem hindering the accuracy and efficiency of segmentation. That is, in scenes with rich context, one click may correspond to multiple potential targets, while most previous interactive segmentors only generate a single mask and fail to deal with target ambiguity. In this paper, we propose a novel interactive segmentation network named PiClick, to yield all potentially reasonable masks and suggest the most plausible one for the user. Specifically, PiClick utilizes a Transformer-based architecture to generate all potential target masks by mutually interactive mask queries. Moreover, a Target Reasoning module (TRM) is designed in PiClick to automatically suggest the user-desired mask from…
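The "generate several candidates, then suggest one" idea can be illustrated with a minimal sketch. This is not PiClick's Target Reasoning module: the abstract does not say how candidates are scored, so the `scores` argument, the `suggest_mask` helper, and the rule that the suggested mask must cover the click are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Mask = List[List[int]]  # binary mask as rows of 0/1 values


def suggest_mask(
    candidates: Sequence[Mask],
    scores: Sequence[float],
    click: Tuple[int, int],
) -> Optional[int]:
    """Return the index of the highest-scoring candidate that covers the click.

    Hypothetical helper: the scores stand in for whatever plausibility
    estimate a model provides; they are not PiClick's actual TRM output.
    """
    if len(candidates) != len(scores):
        raise ValueError("candidates and scores must have the same length")
    row, col = click
    best_idx: Optional[int] = None
    for idx, (mask, score) in enumerate(zip(candidates, scores)):
        if mask[row][col] != 1:
            continue  # assumed rule: the suggestion must contain the click
        if best_idx is None or score > scores[best_idx]:
            best_idx = idx
    return best_idx


# One click at (row 1, col 1) that is ambiguous between a part and the whole.
part = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
whole = [[1, 1, 1, 0], [1, 1, 1, 0], [1, 1, 1, 0], [0, 0, 0, 0]]
other = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]]

print(suggest_mask([part, whole, other], [0.4, 0.9, 0.95], (1, 1)))  # prints 1
```

Here the third candidate has the highest score but does not contain the click, so the sketch suggests the whole-object mask; the remaining candidates stay available for the user to pick instead.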
Taxonomy
Topics: Visual Attention and Saliency Detection · Multimodal Machine Learning Applications · Advanced Neural Network Applications
