DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation

Lukas Hoyer; Dengxin Dai; Luc Van Gool

arXiv:2111.14887 · cs.CV · March 30, 2022 · 1 cites

TL;DR

DAFormer introduces a Transformer-based architecture and training strategies for unsupervised domain adaptation (UDA) in semantic segmentation. It improves on previous methods by 10.8 mIoU on GTA-to-Cityscapes and 5.4 mIoU on Synthia-to-Cityscapes, and it learns difficult classes such as train, bus, and truck.

Contribution

The paper systematically benchmarks network architectures for UDA and proposes DAFormer, which combines a Transformer encoder and a multi-level context-aware feature fusion decoder with training strategies that improve domain adaptation performance.

Findings

01 DAFormer outperforms previous methods by 10.8 mIoU on GTA-to-Cityscapes.

02 DAFormer outperforms previous methods by 5.4 mIoU on Synthia-to-Cityscapes.

03 The model effectively learns difficult classes such as train, bus, and truck.
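
The findings above are stated in mIoU (mean intersection over union), usually reported in percentage points. As a reminder of what that metric measures, here is a minimal pure-Python sketch. It is not the paper's evaluation code: the function name `mean_iou`, the flat-list input format, and the ignore label 255 (a common convention for unlabeled pixels) are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Return (mIoU, per-class IoU list) for flat sequences of class ids.

    Assumes every prediction lies in [0, num_classes). Pixels whose
    ground-truth label equals ignore_index are skipped (assumed convention).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Classes that never occur in prediction or ground truth get None.
    ious = [i / u if u else None for i, u in zip(intersection, union)]
    valid = [x for x in ious if x is not None]
    miou = sum(valid) / len(valid) if valid else float("nan")
    return miou, ious


# Toy example with three classes; the last pixel is unlabeled and ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
miou, ious = mean_iou(pred, target, num_classes=3)
print(round(100 * miou, 1))  # 72.2 (per-class IoU: 1/2, 2/3, 1/1)
```

On a real benchmark the intersection and union counts are typically accumulated over the whole validation set before dividing, rather than averaged per image, so a difference such as 10.8 mIoU refers to percentage points of that dataset-level score.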

Abstract

As acquiring pixel-wise annotations of real-world images for semantic segmentation is a costly process, a model can instead be trained with more accessible synthetic data and adapted to real images without requiring their annotations. This process is studied in unsupervised domain adaptation (UDA). Even though a large number of methods propose new adaptation strategies, they are mostly based on outdated network architectures. As the influence of recent network architectures has not been systematically studied, we first benchmark different network architectures for UDA and newly reveal the potential of Transformers for UDA semantic segmentation. Based on the findings, we propose a novel UDA method, DAFormer. The network architecture of DAFormer consists of a Transformer encoder and a multi-level context-aware feature fusion decoder. It is enabled by three simple but crucial training…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Label Smoothing · Dense Connections · Absolute Position Encodings · Softmax · Residual Connection