Adapters Strike Back
Jan-Martin O. Steitz, Stefan Roth

TL;DR
This paper introduces Adapter+, an improved adapter architecture for transformer models. It outperforms previous adapter implementations and several more complex adaptation mechanisms, and it reaches state-of-the-art average accuracy on the VTAB benchmark with little to no manual intervention.
Contribution
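The paper provides an in-depth study of adapters, their internal structure, and their implementation choices. It identifies pitfalls in using adapters and proposes Adapter+, which outperforms previous adapter implementations and surpasses a number of more complex adaptation mechanisms in several challenging settings.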
Findings
Adapter+ outperforms previous adapter implementations and a number of more complex adaptation mechanisms.
Adapter+ achieves state-of-the-art average accuracy on the VTAB benchmark.
Adapter+ is robust and requires little to no manual intervention when applied to a new scenario.
Abstract
Adapters provide an efficient and lightweight mechanism for adapting trained transformer models to a variety of different tasks. However, they have often been found to be outperformed by other adaptation mechanisms, including low-rank adaptation. In this paper, we provide an in-depth study of adapters, their internal structure, as well as various implementation choices. We uncover pitfalls for using adapters and suggest a concrete, improved adapter architecture, called Adapter+, that not only outperforms previous adapter implementations but surpasses a number of other, more complex adaptation mechanisms in several challenging settings. Despite this, our suggested adapter is highly robust and, unlike previous work, requires little to no manual intervention when addressing a novel scenario. Adapter+ reaches state-of-the-art average accuracy on the VTAB benchmark, even without a per-task…
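For readers unfamiliar with the term, the following is a minimal sketch of a generic bottleneck adapter, the family of modules the abstract refers to. It is not the Adapter+ architecture, whose details the abstract does not give. The class name BottleneckAdapter, the GELU activation, the zero-initialized up-projection, and the dimensions are all assumptions chosen for illustration. It uses plain Python lists to stay dependency-free.

```python
import math
import random


def gelu(x):
    # Exact GELU via the error function (activation choice is an assumption).
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


class BottleneckAdapter:
    """Hypothetical generic adapter: y = x + W_up @ act(W_down @ x + b_down) + b_up."""

    def __init__(self, dim, rank, seed=0):
        rng = random.Random(seed)
        # Down-projection from the model width `dim` to a small bottleneck `rank`.
        self.W_down = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(rank)]
        self.b_down = [0.0] * rank
        # Up-projection starts at zero so the adapter is an identity map at
        # initialization (a common choice, assumed here, not taken from the paper).
        self.W_up = [[0.0] * rank for _ in range(dim)]
        self.b_up = [0.0] * dim

    def __call__(self, x):
        # Bottleneck features, then projection back to `dim`, then residual add.
        h = [gelu(v + b) for v, b in zip(matvec(self.W_down, x), self.b_down)]
        delta = [v + b for v, b in zip(matvec(self.W_up, h), self.b_up)]
        return [xi + di for xi, di in zip(x, delta)]

    def num_params(self):
        # Two weight matrices of dim * rank entries each, plus both bias vectors.
        dim, rank = len(self.W_up), len(self.W_down)
        return 2 * dim * rank + rank + dim


adapter = BottleneckAdapter(dim=16, rank=4)
token = [1.0] * 16
output = adapter(token)
```

In this sketch only the adapter's own weights would be trained while the surrounding transformer stays frozen, which is why the trainable parameter count grows with 2 * dim * rank rather than with the size of the full model.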
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
Methods: Adapter
