Generative Adapter: Contextualizing Language Models in Parameters with A Single Forward Pass

Tong Chen; Hao Fang; Patrick Xia; Xiaodong Liu; Benjamin Van Durme; Luke Zettlemoyer; Jianfeng Gao; Hao Cheng

arXiv:2411.05877 · cs.LG · November 12, 2024

TL;DR

GenerativeAdapter is a novel method that efficiently adapts large language models to new contexts by generating low-rank adapters through self-supervised learning, reducing inference overhead without fine-tuning.

Contribution

It introduces a generative approach to create low-rank adapters for language models, enabling quick adaptation to new tasks or domains without additional fine-tuning.

Findings

01

Significantly improves knowledge injection into LMs, with a 63.5% improvement in F1 score on StreamingQA.

02

Outperforms fine-tuning in accuracy on multiple tasks, achieving an average accuracy of 44.9% on MetaICL.

03

Reduces computation and memory costs by 4x compared to prompting with the full conversation.

Abstract

Large language models (LMs) are typically adapted to improve performance on new contexts (e.g., text prompts that define new tasks or domains) through fine-tuning or prompting. However, there is an accuracy-compute tradeoff: fine-tuning incurs significant training cost, and prompting increases inference overhead. We introduce GenerativeAdapter, an effective and efficient adaptation method that directly maps new contexts to low-rank LM adapters, thereby significantly reducing inference overhead with no need for fine-tuning. The adapter generator is trained via self-supervised learning, and can be used to adapt a single frozen LM for any new task simply by mapping the associated task or domain context to a new adapter. We apply GenerativeAdapter to two pretrained LMs (Mistral-7B-Instruct and Llama2-7B-Chat) and evaluate the adapted models in three adaptation scenarios: knowledge…
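To make the idea of mapping a context to a low-rank adapter more concrete, here is a minimal toy sketch in plain Python. It is not the paper's generator: the function `generate_adapter`, the second-moment context summary, the projection matrices `proj_a` and `proj_b`, and all sizes are assumptions chosen only for illustration. What it does show is the general pattern the abstract describes: the base weight stays frozen, the adapter factors are produced from the context without any gradient steps, and the adapted layer computes `x @ (W + A @ B)`.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def random_matrix(rows, cols, rng, scale=0.1):
    """Gaussian random matrix, used here as a stand-in for real weights/states."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def generate_adapter(context_states, proj_a, proj_b):
    """Hypothetical generator: context hidden states -> low-rank factors (A, B).

    No gradient steps are taken; the factors come from one pass over the context.
    """
    n = len(context_states)
    # Summarize the context as an averaged d x d second-moment matrix (an assumption).
    second_moment = matmul(transpose(context_states), context_states)
    summary = [[v / n for v in row] for row in second_moment]
    factor_a = matmul(summary, proj_a)  # d x r, depends on the context
    factor_b = proj_b                   # r x d, context-independent in this sketch
    return factor_a, factor_b

def adapted_forward(x, frozen_w, factor_a, factor_b):
    """Compute x @ (W + A @ B) without modifying the frozen weight W."""
    base = matmul(x, frozen_w)
    update = matmul(matmul(x, factor_a), factor_b)
    return [[p + q for p, q in zip(rb, ru)] for rb, ru in zip(base, update)]

rng = random.Random(0)
d, r, n_ctx = 8, 2, 5                    # toy sizes, chosen for illustration only
frozen_w = random_matrix(d, d, rng)      # stands in for one frozen LM weight matrix
proj_a = random_matrix(d, r, rng)        # hypothetical trained generator parameters
proj_b = random_matrix(r, d, rng)
context_states = random_matrix(n_ctx, d, rng, scale=1.0)  # stands in for encoded context

factor_a, factor_b = generate_adapter(context_states, proj_a, proj_b)
x = random_matrix(1, d, rng, scale=1.0)  # one query token representation
y = adapted_forward(x, frozen_w, factor_a, factor_b)
print(len(factor_a), len(factor_a[0]), len(factor_b), len(factor_b[0]))
```

In this sketch the adapter adds only `2 * d * r` numbers per weight matrix, and at inference time the context no longer needs to be present in the prompt, which is the source of the reduced inference overhead the abstract refers to. A real system would use the LM's own hidden states and a trained generator rather than random matrices.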

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Balanced Selection · Adapter