Context Patch Fusion With Class Token Enhancement for Weakly Supervised Semantic Segmentation

Yiyang Fu; Hui Li; Wangyu Wu

arXiv:2601.14718 · cs.CV · January 22, 2026

Open Access

TL;DR

This paper introduces CPF-CTE, a novel framework for weakly supervised semantic segmentation that leverages contextual patch relations and class token enhancements to improve feature representation and segmentation accuracy.

Contribution

The proposed CPF-CTE framework uniquely combines bidirectional LSTM-based contextual fusion with learnable class tokens to better capture spatial and semantic dependencies in WSSS.

Findings

01. Outperforms previous WSSS methods on PASCAL VOC 2012

02. Achieves higher segmentation accuracy on MS COCO 2014

03. Enhances feature representation through contextual and semantic integration

Abstract

Weakly Supervised Semantic Segmentation (WSSS), which relies only on image-level labels, has attracted significant attention for its cost-effectiveness and scalability. Existing methods mainly enhance inter-class distinctions and employ data augmentation to mitigate semantic ambiguity and reduce spurious activations. However, they often neglect the complex contextual dependencies among image patches, resulting in incomplete local representations and limited segmentation accuracy. To address these issues, we propose the Context Patch Fusion with Class Token Enhancement (CPF-CTE) framework, which exploits contextual relations among patches to enrich feature representations and improve segmentation. At its core, the Contextual-Fusion Bidirectional Long Short-Term Memory (CF-BiLSTM) module captures spatial dependencies between patches and enables bidirectional information flow, yielding a…
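The bidirectional information flow between patches described in the abstract can be illustrated with a small sketch. This is not the authors' CF-BiLSTM implementation: it is a plain, untrained LSTM cell in standard-library Python, run once left-to-right and once right-to-left over a patch sequence, with the two hidden states concatenated per patch. The raster-scan patch ordering, the dimensions, the random weights, and the names `LSTMCell`, `run_direction`, and `bilstm_fuse` are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """Standard LSTM cell with random, untrained weights (illustrative only)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        cols = input_dim + hidden_dim
        # Rows are grouped by gate: input, forget, candidate, output.
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                  for _ in range(4 * hidden_dim)]
        self.b = [0.0] * (4 * hidden_dim)

    def step(self, x, h, c):
        xh = x + h  # concatenate patch feature and previous hidden state
        z = [sum(w * v for w, v in zip(row, xh)) + b
             for row, b in zip(self.W, self.b)]
        n = self.hidden_dim
        i = [sigmoid(v) for v in z[0:n]]
        f = [sigmoid(v) for v in z[n:2 * n]]
        g = [math.tanh(v) for v in z[2 * n:3 * n]]
        o = [sigmoid(v) for v in z[3 * n:4 * n]]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def run_direction(cell, patches):
    """Scan the patch sequence in the given order, collecting hidden states."""
    h = [0.0] * cell.hidden_dim
    c = [0.0] * cell.hidden_dim
    outputs = []
    for x in patches:
        h, c = cell.step(x, h, c)
        outputs.append(h)
    return outputs


def bilstm_fuse(patches, fwd_cell, bwd_cell):
    """Concatenate forward and backward hidden states for each patch."""
    fwd = run_direction(fwd_cell, patches)
    # Scan the reversed sequence, then restore the original patch order.
    bwd = run_direction(bwd_cell, patches[::-1])[::-1]
    return [f + b for f, b in zip(fwd, bwd)]


rng = random.Random(0)
num_patches, feat_dim, hidden_dim = 6, 4, 3  # hypothetical sizes
patches = [[rng.uniform(-1, 1) for _ in range(feat_dim)]
           for _ in range(num_patches)]
fwd_cell = LSTMCell(feat_dim, hidden_dim, rng)
bwd_cell = LSTMCell(feat_dim, hidden_dim, rng)
fused = bilstm_fuse(patches, fwd_cell, bwd_cell)
print(len(fused), len(fused[0]))  # 6 6
```

Each fused vector has twice the hidden size: the first half summarizes the patches before and including the current one, the second half summarizes the patches from the current one onward. In this toy setup that is the sense in which every patch representation receives context from both directions.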

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis