Vision Transformer Adapters for Generalizable Multitask Learning
Deblina Bhattacharjee, Sabine Süsstrunk, Mathieu Salzmann

TL;DR
This paper presents a multitasking vision transformer adapter framework that efficiently learns generalizable task affinities, enabling zero-shot transfer, domain adaptation, and multi-task learning without retraining for new tasks or domains.
Contribution
It introduces a novel task-adapted attention mechanism within vision transformer adapters that generalizes to unseen tasks and domains without retraining.
Findings
Outperforms existing CNN-based and transformer-based multitasking methods.
Enables zero-shot task transfer and unsupervised domain adaptation.
Is parameter-efficient and requires no retraining or fine-tuning when a new task or domain is added.
Abstract
We introduce the first multitasking vision transformer adapters that learn generalizable task affinities which can be applied to novel tasks and domains. Integrated into an off-the-shelf vision transformer backbone, our adapters can simultaneously solve multiple dense vision tasks in a parameter-efficient manner, unlike existing multitasking transformers that are parametrically expensive. In contrast to concurrent methods, we do not require retraining or fine-tuning whenever a new task or domain is added. We introduce a task-adapted attention mechanism within our adapter framework that combines gradient-based task similarities with attention-based ones. The learned task affinities generalize to the following settings: zero-shot task transfer, unsupervised domain adaptation, and generalization without fine-tuning to novel domains. We demonstrate that our approach outperforms not only the…
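The abstract says the task-adapted attention combines gradient-based task similarities with attention-based ones, but it does not say how. The following is a minimal sketch of one way such a combination could look, not the authors' implementation. It assumes each task is summarized by a flattened gradient vector and a feature vector, and that the two similarity matrices are blended with a fixed weight. The function names, the blend weight `alpha`, and the random inputs are all hypothetical.

```python
import numpy as np


def gradient_task_similarity(grads: np.ndarray) -> np.ndarray:
    """Cosine similarity between per-task gradient vectors, shape (T, P) -> (T, T)."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    unit = grads / np.clip(norms, 1e-12, None)  # guard against zero-norm gradients
    return unit @ unit.T


def attention_task_similarity(task_feats: np.ndarray) -> np.ndarray:
    """Row-wise softmax of scaled dot products between task features, (T, D) -> (T, T)."""
    dim = task_feats.shape[1]
    scores = task_feats @ task_feats.T / np.sqrt(dim)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


def combined_affinity(grads: np.ndarray, task_feats: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Assumed blend: alpha * gradient similarity + (1 - alpha) * attention similarity."""
    grad_sim = gradient_task_similarity(grads)
    attn_sim = attention_task_similarity(task_feats)
    return alpha * grad_sim + (1.0 - alpha) * attn_sim


# Toy inputs: 3 tasks, 16 shared parameters, 8-dimensional task features (all made up).
rng = np.random.default_rng(0)
grads = rng.normal(size=(3, 16))
task_feats = rng.normal(size=(3, 8))

affinity = combined_affinity(grads, task_feats, alpha=0.5)
print(affinity.shape)
```

Entry (i, j) of the resulting matrix is a rough score of how strongly task j relates to task i. The fixed linear blend is only a stand-in for whatever combination the paper actually learns.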
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Layer Normalization · Residual Connection · Softmax · Dense Connections · Vision Transformer · Adapter
