On the Transformations across Reward Model, Parameter Update, and In-Context Prompt

Deng Cai; Huayang Li; Tingchen Fu; Siheng Li; Weiwen Xu; Shuaiyi Li; Bowen Cao; Zhisong Zhang; Xinting Huang; Leyang Cui; Yan Wang; Lemao Liu; Taro Watanabe; Shuming Shi

arXiv:2406.16377 · cs.CL · June 25, 2024

TL;DR

This paper shows that reward modeling, parameter updating, and in-context prompting in large language models can be transformed into one another, and organizes these transformations into a unified framework intended to guide future research and applications.

Contribution

The paper introduces a triangular framework that unifies three LLM adaptation tools (parameter updating, reward modeling, and in-context prompting), highlighting their interchangeability and the applications each transformation supports.

Findings

01

Six transformation directions, one for each ordered pair of the three adaptation tools, make up the framework.

02

The framework unifies numerous existing adaptation studies under a common structure.

03

The framework suggests potential research directions and serves as a roadmap for future LLM research.

Abstract

Despite the general capabilities of pre-trained large language models (LLMs), they still need further adaptation to better serve practical applications. In this paper, we demonstrate the interchangeability of three popular and distinct adaptation tools: parameter updating, reward modeling, and in-context prompting. This interchangeability establishes a triangular framework with six transformation directions, each of which facilitates a variety of applications. Our work offers a holistic view that unifies numerous existing studies and suggests potential research directions. We envision our work as a useful roadmap for future research on LLMs.

Taxonomy

Topics: Simulation Techniques and Applications