SynSeg: Feature Synergy for Multi-Category Contrastive Learning in End-to-End Open-Vocabulary Semantic Segmentation

Weichen Zhang; Kebin Liu; Fan Dang; Zhui Zhu; Xikai Sun; Yunhao Liu

arXiv:2508.06115·cs.CV·November 18, 2025

TL;DR

SynSeg is a weakly-supervised framework that combines multi-category contrastive learning with feature reconstruction for open-vocabulary semantic segmentation, and it reports state-of-the-art results with a lightweight end-to-end design.

Contribution

The paper proposes SynSeg, a lightweight end-to-end method that uses Multi-Category Contrastive Learning (MCCL) and a Feature Synergy Structure (FSS) to improve weakly-supervised open-vocabulary semantic segmentation.

Findings

01. Outperforms state-of-the-art methods on benchmarks.

02. Achieves 6.9% to 26.2% higher accuracy than baselines.

03. Effectively improves semantic localization and discrimination.

Abstract

Semantic segmentation in open-vocabulary scenarios presents significant challenges due to the wide range and granularity of semantic categories. Existing weakly-supervised methods often rely on category-specific supervision and ill-suited feature construction methods for contrastive learning, leading to semantic misalignment and poor performance. In this work, we propose a novel weakly-supervised approach, SynSeg, to address these challenges. SynSeg performs Multi-Category Contrastive Learning (MCCL) as a stronger training signal, with a new feature reconstruction framework named Feature Synergy Structure (FSS). Specifically, the MCCL strategy robustly combines intra- and inter-category alignment and separation so that the model learns the correlations among different categories within the same image. Moreover, FSS reconstructs discriminative features for contrastive…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification