Smoothing Matters: Momentum Transformer for Domain Adaptive Semantic Segmentation

Runfa Chen; Yu Rong; Shangmin Guo; Jiaqi Han; Fuchun Sun; Tingyang Xu; Wenbing Huang

arXiv:2203.07988 · cs.CV · March 16, 2022 · 1 citation

TL;DR

This paper introduces a momentum-based smoothing technique and dynamic discrepancy measurement to improve domain adaptive semantic segmentation with Vision Transformers, addressing high-frequency noise issues and enhancing transferability.

Contribution

It proposes a novel momentum network and dynamic discrepancy measurement to enhance the transferability of local ViTs in domain adaptive semantic segmentation.

Findings

01 Outperforms state-of-the-art methods on sim2real benchmarks

02 Effectively reduces high-frequency noise in features and pseudo labels

03 Improves the transferability of local Vision Transformers

Abstract

After the great success of Vision Transformer variants (ViTs) in computer vision, they have also demonstrated great potential in domain adaptive semantic segmentation. Unfortunately, straightforwardly applying local ViTs to domain adaptive semantic segmentation does not bring the expected improvement. We find that the pitfall of local ViTs lies in the severe high-frequency components generated during both pseudo-label construction and feature alignment for target domains. These high-frequency components make the training of local ViTs very unsmooth and hurt their transferability. In this paper, we introduce a low-pass filtering mechanism, the momentum network, to smooth the learning dynamics of target domain features and pseudo labels. Furthermore, we propose a dynamic discrepancy measurement that aligns the distributions of the source and target domains via dynamic weights to evaluate…
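The abstract describes the momentum network as a low-pass filter but does not state its update rule. A minimal sketch, assuming the common exponential-moving-average (EMA) form in which a slowly updated copy tracks the trained ("online") parameters, might look like the following. The function name `momentum_update`, the coefficient `m`, and the toy scalar weight are illustrative assumptions, not taken from the paper or its repository.

```python
from typing import Dict


def momentum_update(
    momentum_params: Dict[str, float],
    online_params: Dict[str, float],
    m: float = 0.99,
) -> Dict[str, float]:
    """Return an exponential moving average of the online parameters.

    Assumed EMA rule (not confirmed by the abstract):
        new = m * old_momentum + (1 - m) * online
    A value of m close to 1 gives heavier smoothing.
    """
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must lie in [0, 1]")
    return {
        name: m * momentum_params[name] + (1.0 - m) * online_params[name]
        for name in momentum_params
    }


# Toy demonstration: an online weight that flips sign every step
# (a purely high-frequency signal) is strongly attenuated in the
# averaged copy, which is the low-pass behaviour described above.
momentum = {"w": 0.0}
for step in range(200):
    online = {"w": 1.0 if step % 2 == 0 else -1.0}
    momentum = momentum_update(momentum, online, m=0.9)

# The online weight has magnitude 1; the averaged weight stays below 0.1.
print(abs(momentum["w"]) < 0.1)
```

In this toy setting the averaged weight barely moves even though the online weight oscillates at full amplitude, which illustrates why pseudo labels produced from such an averaged copy would change more smoothly between training steps.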

Code & Models

Repositories

alpc91/transda
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Face recognition and analysis

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Dropout · Layer Normalization · Adam · Label Smoothing · Absolute Position Encodings · Position-Wise Feed-Forward Layer