Dynamically Decoding Source Domain Knowledge for Domain Generalization
Cuicui Kang, Karthik Nandakumar

TL;DR
This paper introduces a Transformer-based approach that dynamically decodes source domain knowledge during inference and reports state-of-the-art results on three domain generalization benchmarks.
Contribution
The paper proposes a Transformer framework with one domain-specific expert per source domain and a domain-agnostic query branch, so that source domain knowledge is used more fully when predicting on unseen domains.
Findings
Achieves state-of-the-art results on three domain generalization benchmarks.
Demonstrates effective dynamic decoding of source domain knowledge.
Outperforms existing multi-expert frameworks in domain generalization.
Abstract
Optimizing the performance of classifiers on samples from unseen domains remains a challenging problem. While most existing studies on domain generalization focus on learning domain-invariant feature representations, multi-expert frameworks have been proposed as a possible solution and have demonstrated promising performance. However, current multi-expert learning frameworks fail to fully exploit source domain knowledge during inference, resulting in sub-optimal performance. In this work, we propose to adapt Transformers for the purpose of dynamically decoding source domain knowledge for domain generalization. Specifically, we build one domain-specific local expert per source domain and one domain-agnostic feature branch as query. A Transformer encoder encodes all domain-specific features as source domain knowledge in memory. In the Transformer decoder, the domain-agnostic query…
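The abstract is cut off before it says what the decoder does with the domain-agnostic query, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the query attends over the per-domain expert features (the "memory") with standard single-head scaled dot-product attention, and it omits learned projections and the encoder step. The names `cross_attend`, `memory`, and `query`, and all feature values, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(query, memory):
    """Single-head scaled dot-product attention of one query over memory rows.

    query:  list of d floats (assumed domain-agnostic feature)
    memory: list of K rows, each d floats (assumed one row per source-domain expert)
    Returns the attention-weighted mix of memory rows and the K weights.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, row)) / math.sqrt(d)
        for row in memory
    ]
    weights = softmax(scores)
    decoded = [
        sum(w * row[i] for w, row in zip(weights, memory))
        for i in range(d)
    ]
    return decoded, weights


# Hypothetical features for three source-domain experts (illustrative values only).
memory = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
# Hypothetical domain-agnostic query that resembles the first expert's feature.
query = [2.0, 0.0, 0.0, 0.0]

decoded, weights = cross_attend(query, memory)
print("attention over source domains:", weights)
print("decoded feature:", decoded)
```

In this toy setting the weights depend on the test sample's query, so the mix of source-domain features changes from sample to sample; that per-sample weighting is one plausible reading of "dynamically decoding" source domain knowledge at inference time.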
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Machine Learning and ELM
Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Adam · Byte Pair Encoding · Label Smoothing · Softmax · Dense Connections · Absolute Position Encodings · Residual Connection
