TransAdapter: Vision Transformer for Feature-Centric Unsupervised Domain Adaptation
A. Enes Doruk, Erhan Oztop, Hasan F. Ates

TL;DR
This paper introduces TransAdapter, a vision transformer-based method for unsupervised domain adaptation that improves domain alignment and generalization without task-specific alignment modules, and reports state-of-the-art results.
Contribution
The paper proposes a novel UDA approach built on the Swin Transformer, with three modules intended to strengthen domain alignment: Graph Domain Discriminator, Adaptive Double Attention, and Cross-Feature Transform.
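The abstract describes the Graph Domain Discriminator as combining graph convolutions with an entropy-based measure. The following is a minimal sketch of those two ingredients only, not the authors' implementation. The function names, the unweighted neighbor averaging in `graph_conv`, and the use of binary entropy on a domain probability are assumptions made for illustration.

```python
import math

def graph_conv(features, adjacency):
    """One propagation step over a patch graph (hypothetical simplification).

    Adds self-loops, row-normalizes the adjacency, and averages neighbor
    features. A real graph convolution would also apply a learned weight
    matrix and a nonlinearity; both are omitted here.
    """
    n = len(features)
    dim = len(features[0])
    out = []
    for i in range(n):
        row = [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        total = sum(row)
        out.append([
            sum(row[j] * features[j][k] for j in range(n)) / total
            for k in range(dim)
        ])
    return out

def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a source-vs-target probability p.

    Assumed reading: entropy is high when the discriminator cannot tell
    the domains apart (p near 0.5) and low when it is confident.
    """
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

# Two connected patches with scalar features are smoothed toward each other.
smoothed = graph_conv([[0.0], [2.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Under this reading, a patch whose domain probability is close to 0.5 gets a larger entropy score than one the discriminator classifies confidently; how the paper turns that score into attention weights is not stated in the text above.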
Findings
Achieves state-of-the-art performance on multiple benchmarks.
Effectively captures local and global dependencies in domain adaptation.
No task-specific alignment modules needed.
Abstract
Unsupervised Domain Adaptation (UDA) aims to utilize labeled data from a source domain to solve tasks in an unlabeled target domain, often hindered by significant domain gaps. Traditional CNN-based methods struggle to fully capture complex domain relationships, motivating the shift to vision transformers like the Swin Transformer, which excel in modeling both local and global dependencies. In this work, we propose a novel UDA approach leveraging the Swin Transformer with three key modules. A Graph Domain Discriminator enhances domain alignment by capturing inter-pixel correlations through graph convolutions and entropy-based attention differentiation. An Adaptive Double Attention module combines Windows and Shifted Windows attention with dynamic reweighting to align long-range and local features effectively. Finally, a Cross-Feature Transform modifies Swin Transformer blocks to improve…
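To make the Adaptive Double Attention idea concrete, here is a minimal sketch that blends window attention and shifted-window attention with a pair of softmax gate weights. It is an illustration under stated assumptions, not the paper's method: tokens form a 1D sequence rather than a 2D feature map, attention has no learned projections, the shifted branch uses a cyclic roll without the attention mask that Swin applies, and `gate_logits` is a hypothetical fixed input standing in for whatever dynamic reweighting the authors compute.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, tokens)) for j in range(d)])
    return out

def window_attention(tokens, window, shift=0):
    """Attend inside non-overlapping windows of a 1D token sequence.

    With shift > 0 the sequence is cyclically rolled first, so window
    boundaries fall in different places; the roll is undone afterwards.
    """
    n = len(tokens)
    rolled = tokens[shift:] + tokens[:shift]
    out = []
    for start in range(0, n, window):
        out.extend(self_attention(rolled[start:start + window]))
    return out[n - shift:] + out[:n - shift]

def adaptive_double_attention(tokens, window, gate_logits):
    """Blend window and shifted-window outputs with softmax gate weights.

    gate_logits is a hypothetical pair of scores; a learned module would
    predict them from the features.
    """
    w_out = window_attention(tokens, window)
    sw_out = window_attention(tokens, window, shift=window // 2)
    alpha, beta = softmax(gate_logits)
    return [
        [alpha * a + beta * b for a, b in zip(x, y)]
        for x, y in zip(w_out, sw_out)
    ]

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
blended = adaptive_double_attention(tokens, window=2, gate_logits=[0.0, 0.0])
```

With equal gate logits the two branches are averaged; pushing one logit much higher than the other recovers that branch alone, which is the sense in which the reweighting trades off local windows against the shifted windows that connect them.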
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning
Methods: Stochastic Depth · Linear Layer · Swin Transformer · ALIGN · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Dropout · Byte Pair Encoding
