TransCAM: Transformer Attention-based CAM Refinement for Weakly Supervised Semantic Segmentation
Ruiwen Li, Zheda Mai, Chiheb Trabelsi, Zhibo Zhang, Jongseong Jang, Scott Sanner

TL;DR
This paper introduces TransCAM, a novel transformer attention-based method that refines class activation maps in weakly supervised semantic segmentation, significantly improving accuracy by leveraging multi-level attention features from a Conformer model.
Contribution
The paper proposes TransCAM, a simple yet effective approach that uses transformer attention weights from different layers to enhance CAM quality in WSSS, achieving state-of-the-art results.
Findings
TransCAM achieves 69.3% mIoU on the PASCAL VOC 2012 validation set.
Refinement using transformer attention improves CAM completeness.
Multi-level attention captures both low-level and high-level features.
Abstract
Weakly supervised semantic segmentation (WSSS) with only image-level supervision is a challenging task. Most existing methods exploit Class Activation Maps (CAM) to generate pixel-level pseudo labels for supervised training. However, due to the local receptive field of Convolutional Neural Networks (CNN), CAM applied to CNNs often suffers from partial activation -- highlighting the most discriminative part instead of the entire object area. In order to capture both local features and global representations, the Conformer has been proposed to combine a visual transformer branch with a CNN branch. In this paper, we propose TransCAM, a Conformer-based solution to WSSS that explicitly leverages the attention weights from the transformer branch of the Conformer to refine the CAM generated from the CNN branch. TransCAM is motivated by our observation that attention weights from shallow…
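The refinement idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `refine_cam`, the tensor shapes, and the choice to average attention uniformly over all layers and heads are assumptions made for illustration. It also assumes the class token has already been dropped, so the attention matrix covers only the h*w patch tokens.

```python
import numpy as np


def refine_cam(cam: np.ndarray, attn: np.ndarray) -> np.ndarray:
    """Propagate CAM scores between patches using averaged attention.

    cam:  (C, h, w) class activation maps from the CNN branch.
    attn: (L, H, N, N) attention weights for L layers and H heads,
          with N == h * w (class token already removed, by assumption).
    """
    num_classes, h, w = cam.shape
    if attn.shape[-1] != h * w or attn.shape[-2] != h * w:
        raise ValueError("attention size must match the number of CAM patches")

    # Assumption: a plain mean over layers and heads gives one (N, N) matrix.
    mean_attn = attn.mean(axis=(0, 1))

    # Flatten the spatial grid so each class map is a length-N vector.
    cam_flat = cam.reshape(num_classes, h * w)

    # refined[c, i] = sum_j mean_attn[i, j] * cam_flat[c, j]:
    # each patch i collects activation from the patches j it attends to.
    refined = cam_flat @ mean_attn.T

    return refined.reshape(num_classes, h, w)
```

With identity attention the maps come back unchanged, while uniform attention spreads each class map evenly over the whole grid. In between, a patch that attends to a strongly activated patch inherits part of its score, which is one way the partial-activation problem could be reduced.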
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Class-activation map · Convolution
