Calibrating Undisciplined Over-Smoothing in Transformer for Weakly Supervised Semantic Segmentation

Lechao Cheng; Zerun Liu; Jingxuan He; Chaowei Fang; Dingwen Zhang; Meng Wang

arXiv:2305.03112 · cs.CV · May 30, 2025 · 6 citations

Open Access

TL;DR

This paper introduces AReAM, an entropy-aware mechanism that calibrates deep-level attention in transformer models for weakly supervised semantic segmentation, reducing over-smoothing and improving segmentation accuracy.

Contribution

It proposes an adaptive re-activation mechanism that leverages shallow-level affinity to regulate deep-layer attention, addressing over-smoothing in transformer-based WSSS.

Findings

01 AReAM significantly improves segmentation performance on benchmark datasets.

02 The method reduces background noise and sharpens focus on relevant regions.

03 Experiments demonstrate better CAM refinement than existing methods.

Abstract

Weakly supervised semantic segmentation (WSSS) has recently attracted considerable attention because it requires fewer annotations than fully supervised approaches, making it especially promising for large-scale image segmentation tasks. Although many vision transformer-based methods leverage self-attention affinity matrices to refine Class Activation Maps (CAMs), they often treat each layer's affinity equally and thus introduce considerable background noise at deeper layers, where attention tends to converge excessively on certain tokens (i.e., over-smoothing). We observe that this deep-level attention naturally converges on a subset of tokens, yet unregulated query-key affinity can generate unpredictable activation patterns (undisciplined over-smoothing), adversely affecting CAM accuracy. To address these limitations, we propose an Adaptive Re-Activation Mechanism (AReAM), which…
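
To make the problem statement concrete, the following is a minimal sketch of affinity-based CAM refinement in plain Python. It is not the AReAM algorithm, whose description is truncated above. It only illustrates what the abstract criticizes: averaging every layer's affinity with equal weight lets an over-smoothed deep layer flatten the CAM. The function names `refine_cam` and `row_spread`, the toy 3-token matrices, and the hand-picked weights are all assumptions made for illustration.

```python
# Illustrative sketch only; NOT the paper's AReAM method.
# Attention matrices are row-stochastic lists of lists (one row per query token).

def refine_cam(cam, affinities, weights=None):
    """Propagate per-token CAM scores through a weighted sum of layer affinities.

    With weights=None every layer counts equally, which is the baseline
    behaviour the abstract describes as problematic.
    """
    n_layers = len(affinities)
    if weights is None:
        weights = [1.0 / n_layers] * n_layers
    if len(weights) != n_layers:
        raise ValueError("need exactly one weight per layer")
    n = len(cam)
    refined = [0.0] * n
    for w, attn in zip(weights, affinities):
        for i in range(n):
            refined[i] += w * sum(attn[i][j] * cam[j] for j in range(n))
    return refined


def row_spread(attn):
    """Hypothetical over-smoothing diagnostic (an assumption, not from the paper).

    Mean absolute deviation of each row from the average row. A value near
    zero means every query attends to the same tokens.
    """
    n = len(attn)
    mean_row = [sum(attn[i][j] for i in range(n)) / n for j in range(n)]
    return sum(abs(attn[i][j] - mean_row[j])
               for i in range(n) for j in range(n)) / (n * n)


# Toy example: 3 tokens, the class is active only at token 0.
cam = [1.0, 0.0, 0.0]
shallow = [[0.8, 0.1, 0.1],   # each token mostly attends to itself
           [0.1, 0.8, 0.1],
           [0.1, 0.1, 0.8]]
deep = [[0.1, 0.1, 0.8],      # every query converges on token 2
        [0.1, 0.1, 0.8],
        [0.1, 0.1, 0.8]]

deep_only = refine_cam(cam, [deep])
equal = refine_cam(cam, [shallow, deep])
shallow_heavy = refine_cam(cam, [shallow, deep], weights=[0.9, 0.1])

print("deep only:     ", deep_only)
print("equal weights: ", equal)
print("shallow-heavy: ", shallow_heavy)
print("row spread (shallow, deep):", row_spread(shallow), row_spread(deep))
```

In this toy setting the deep layer alone assigns the same score to every token, so the localization in the CAM is lost, while giving the shallow layer more weight keeps the response concentrated on token 0. The weights here are chosen by hand; how they should be set adaptively is the question the paper addresses.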

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM