Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation
Yi-Chen Li, Fuxiang Zhang, Wenjie Qiu, Lei Yuan, Chengxing Jia, Zongzhang Zhang, Yang Yu, Bo An

TL;DR
Q-Adapter is a novel method for customizing pre-trained large language models to new human preferences by using residual Q-learning, effectively balancing the retention of original capabilities with adaptation to new preferences.
Contribution
The paper introduces Q-Adapter, a residual Q-learning based approach that enables effective customization of pre-trained LLMs to new human preferences without retraining from scratch.
Findings
Q-Adapter outperforms existing methods in preserving original knowledge.
It effectively learns new preferences from limited data.
It demonstrates superior performance on Llama-3.1 with the DSP and HH-RLHF datasets.
Abstract
Large Language Models (LLMs), trained on a large amount of corpus, have demonstrated remarkable abilities. However, it may not be sufficient to directly apply open-source LLMs like Llama to certain real-world scenarios, since most of them are trained for general purposes. Thus, the demands for customizing publicly available LLMs emerge, but are currently under-studied. In this work, we consider customizing pre-trained LLMs with new human preferences. Specifically, the LLM should not only meet the new preference but also preserve its original capabilities after customization. Drawing inspiration from the observation that human preference can be expressed as a reward model, we propose to cast LLM customization as optimizing the sum of two reward functions, one of which (denoted as $r_1$) was used to pre-train the LLM while the other (denoted as $r_2$) characterizes the new human…
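The sum-of-rewards framing can be illustrated with a toy, single-step example. This is only a sketch under strong assumptions, not the paper's algorithm: write r_1 for the reward behind the pre-trained model and r_2 for the new preference reward, and assume a small discrete set of responses, an entropy-regularized objective with temperature `alpha`, and a pre-trained policy that is exactly the softmax of `r_1 / alpha`. All names and numbers below are hypothetical. Under these assumptions, the policy that is optimal for r_1 + r_2 can be computed from the pre-trained policy's probabilities and r_2 alone, without access to r_1.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def customize(pretrained_probs, new_reward, alpha=1.0):
    """Reweight the pre-trained policy by exp(r_2 / alpha) and renormalize.

    Assumption: pretrained_probs is proportional to exp(r_1 / alpha),
    so log-probabilities stand in for r_1 / alpha up to a constant.
    """
    logits = [math.log(p) + r / alpha
              for p, r in zip(pretrained_probs, new_reward)]
    return softmax(logits)

# Hypothetical toy rewards over three candidate responses.
r1 = [1.0, 0.5, -1.0]   # reward behind the pre-trained model (normally unknown)
r2 = [0.0, 2.0, 0.0]    # reward describing the new preference
alpha = 1.0

pi1 = softmax([r / alpha for r in r1])       # stands in for the pre-trained LLM
pi_custom = customize(pi1, r2, alpha)        # uses only pi1 and r2
pi_direct = softmax([(a + b) / alpha for a, b in zip(r1, r2)])  # uses r1 + r2

# The two routes agree up to floating-point error.
max_gap = max(abs(a - b) for a, b in zip(pi_custom, pi_direct))
```

In this toy setting `pi_custom` and `pi_direct` coincide, which is the single-step intuition for why a customization method may not need the original reward function explicitly. The multi-step, token-level case treated in the paper is more involved and is not reproduced here.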
Taxonomy
Topics: Business Process Modeling and Analysis · Advanced Database Systems and Queries
Methods: LLaMA · Q-Learning · Adapter
