On the Transformations across Reward Model, Parameter Update, and In-Context Prompt
Deng Cai, Huayang Li, Tingchen Fu, Siheng Li, Weiwen Xu, Shuaiyi Li, Bowen Cao, Zhisong Zhang, Xinting Huang, Leyang Cui, Yan Wang, Lemao Liu, Taro Watanabe, and Shuming Shi

TL;DR
This paper shows that reward modeling, parameter updating, and in-context prompting in large language models can be transformed into one another, and organizes these transformations into a unified framework to guide future research and applications.
Contribution
It introduces a triangular framework unifying three adaptation tools for LLMs, highlighting their interchangeability and potential for diverse applications.
Findings
The three adaptation tools yield six transformation directions, which together form a triangular framework.
Unifies existing adaptation methods under a common structure.
Provides a roadmap for future LLM research.
Abstract
Despite the general capabilities of pre-trained large language models (LLMs), they still need further adaptation to better serve practical applications. In this paper, we demonstrate the interchangeability of three popular and distinct adaptation tools: parameter updating, reward modeling, and in-context prompting. This interchangeability establishes a triangular framework with six transformation directions, each of which facilitates a variety of applications. Our work offers a holistic view that unifies numerous existing studies and suggests potential research directions. We envision our work as a useful roadmap for future research on LLMs.
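The count of six directions follows from taking every ordered pair of the three adaptation tools. A minimal sketch of that enumeration is given below; the tool labels and the function name `transformation_directions` are illustrative assumptions, not names taken from the paper.

```python
from itertools import permutations

# Labels for the three adaptation tools named in the abstract
# (the exact strings are an assumption made for illustration).
TOOLS = ("parameter update", "reward model", "in-context prompt")


def transformation_directions(tools=TOOLS):
    """Return every ordered (source, target) pair of distinct tools."""
    return list(permutations(tools, 2))


directions = transformation_directions()
for source, target in directions:
    print(f"{source} -> {target}")
```

Each edge of the triangle is counted twice, once per direction, so three tools give 3 × 2 = 6 transformations.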
Taxonomy
Topics: Simulation Techniques and Applications
