CoCo-SAM3: Harnessing Concept Conflict in Open-Vocabulary Semantic Segmentation

Yanhui Chen; Baoyao Yang; Siqi Liu; Jingchao Wang

arXiv:2604.19648 · cs.CV · April 22, 2026

TL;DR

CoCo-SAM3 improves open-vocabulary semantic segmentation by aligning evidence from synonyms and enabling direct pixel-wise class comparison, reducing conflicts and enhancing stability without extra training.

Contribution

The paper introduces a decoupled inference framework that first aligns evidence across synonymous prompts of the same class and then runs a unified inter-class competition, addressing conflicts in multi-class open-vocabulary segmentation.

Findings

01. Achieves consistent improvements across eight benchmarks.
02. Effectively mitigates inter-class conflicts and intra-class drift.
03. Enhances inference stability without additional training.

Abstract

SAM3 advances open-vocabulary semantic segmentation by introducing a prompt-driven mask generation paradigm. However, in multi-class open-vocabulary scenarios, masks generated independently from different category prompts lack a unified and inter-class comparable evidence scale, often resulting in overlapping coverage and unstable competition. Moreover, synonymous expressions of the same concept tend to activate inconsistent semantic and spatial evidence, leading to intra-class drift that exacerbates inter-class conflicts and compromises overall inference stability. To address these issues, we propose CoCo-SAM3 (Concept-Conflict SAM3), which explicitly decouples inference into intra-class enhancement and inter-class competition. Our method first aligns and aggregates evidence from synonymous prompts to strengthen concept consistency. It then performs inter-class competition on a unified…
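The two-stage idea in the abstract (fuse evidence from synonymous prompts within a class, then let classes compete per pixel on a shared scale) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract is truncated before it specifies the alignment step or the unified scale, so the plain mean over synonyms, the per-pixel argmax, the background threshold, and all names here (`aggregate_synonyms`, `compete`, `segment`) are assumptions chosen only to make the idea concrete. Score maps are assumed to be per-pixel values in [0, 1], one map per prompt.

```python
from typing import Dict, List

# A score map is an H x W grid of per-pixel scores in [0, 1] (assumed format).
ScoreMap = List[List[float]]


def aggregate_synonyms(maps: List[ScoreMap]) -> ScoreMap:
    """Intra-class step (assumption): average the maps of synonymous prompts."""
    height, width = len(maps[0]), len(maps[0][0])
    return [
        [sum(m[y][x] for m in maps) / len(maps) for x in range(width)]
        for y in range(height)
    ]


def compete(class_maps: Dict[str, ScoreMap], threshold: float = 0.5) -> List[List[str]]:
    """Inter-class step (assumption): per-pixel argmax over the fused class maps.

    Pixels whose best score falls below `threshold` are labelled 'background'.
    """
    names = list(class_maps)
    first = class_maps[names[0]]
    height, width = len(first), len(first[0])
    labels: List[List[str]] = []
    for y in range(height):
        row: List[str] = []
        for x in range(width):
            best = max(names, key=lambda n: class_maps[n][y][x])
            row.append(best if class_maps[best][y][x] >= threshold else "background")
        labels.append(row)
    return labels


def segment(prompt_maps: Dict[str, List[ScoreMap]], threshold: float = 0.5) -> List[List[str]]:
    """Fuse synonym maps per class, then resolve overlaps by direct comparison."""
    fused = {name: aggregate_synonyms(maps) for name, maps in prompt_maps.items()}
    return compete(fused, threshold)


if __name__ == "__main__":
    # Toy 1 x 3 image; the "sofa" class has two synonymous prompts.
    prompt_maps = {
        "sofa": [[[0.9, 0.2, 0.1]], [[0.5, 0.4, 0.3]]],  # e.g. "sofa", "couch"
        "chair": [[[0.6, 0.8, 0.1]]],
    }
    print(segment(prompt_maps))
```

In this toy input, thresholding each prompt independently at 0.5 would mark the first pixel as both "sofa" and "chair", which is the overlapping coverage the abstract describes. Comparing the fused scores directly assigns each pixel to exactly one label.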
