DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation
Lukas Hoyer, Dengxin Dai, Luc Van Gool

TL;DR
DAFormer pairs a Transformer encoder with a multi-level context-aware feature fusion decoder and three training strategies for unsupervised domain adaptation (UDA) in semantic segmentation. It improves on previous methods on synthetic-to-real benchmarks and learns difficult classes such as train, bus, and truck.
Contribution
The paper systematically benchmarks network architectures for UDA and proposes DAFormer, a Transformer-based method whose training strategies improve domain adaptation performance.
Findings
DAFormer improves on the previous state of the art by 10.8 mIoU on GTA-to-Cityscapes.
On Synthia-to-Cityscapes, the improvement is 5.4 mIoU.
The model effectively learns difficult classes like train, bus, and truck.
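The gains above are reported in mIoU (mean intersection-over-union), the standard semantic segmentation metric, so a "10.8 mIoU" improvement is a difference of that many percentage points. The following is a minimal sketch of how such a score can be computed. It is not the authors' evaluation code. The function name `mean_iou`, the flat-list input format, and the ignore label 255 are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Return mIoU in percent for flat sequences of integer class ids.

    Assumption: predictions always lie in [0, num_classes); only the
    ground truth may contain ignore_index for unlabeled pixels.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not contribute to the score
        if p == t:
            # Pixel lies in both the predicted and the true region of class t.
            intersection[t] += 1
            union[t] += 1
        else:
            # Pixel lies in the true region of t and the predicted region of p.
            union[t] += 1
            union[p] += 1
    # Average only over classes that occur in the prediction or ground truth.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        return 0.0
    return 100.0 * sum(ious) / len(ious)


# Toy example with three classes and one ignored pixel.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU: {score:.2f}")
```

In this sketch the per-class counts are accumulated over every pixel passed in, so concatenating all images of a validation set before calling the function gives a dataset-level score rather than an average of per-image scores. Because each class contributes equally to the mean, rare classes such as train, bus, and truck weigh as much as frequent ones.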
Abstract
As acquiring pixel-wise annotations of real-world images for semantic segmentation is a costly process, a model can instead be trained with more accessible synthetic data and adapted to real images without requiring their annotations. This process is studied in unsupervised domain adaptation (UDA). Even though a large number of methods propose new adaptation strategies, they are mostly based on outdated network architectures. As the influence of recent network architectures has not been systematically studied, we first benchmark different network architectures for UDA and newly reveal the potential of Transformers for UDA semantic segmentation. Based on the findings, we propose a novel UDA method, DAFormer. The network architecture of DAFormer consists of a Transformer encoder and a multi-level context-aware feature fusion decoder. It is enabled by three simple but crucial training…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Multimodal Machine Learning Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Label Smoothing · Dense Connections · Absolute Position Encodings · Softmax · Residual Connection
