Prompt Baking

Aman Bhargava; Cameron Witkowski; Alexander Detkov; Matt Thomson

arXiv:2409.13697·cs.CL·September 24, 2024

TL;DR

This paper introduces Prompt Baking, a technique to embed prompts directly into LLM weights, enabling models to retain prompt effects while remaining adaptable to further prompting and re-baking, facilitating iterative self-improvement.

Contribution

Prompt Baking is a novel method that converts prompts into weight updates, allowing LLMs to internalize prompts and improve performance across various tasks without losing reactivity.

Findings

01. Baked prompts improve zero-shot performance on multiple benchmarks.

02. Baked models retain sensitivity to further prompts and re-baking.

03. Re-prompting and re-baking lead to significant performance gains.

Abstract

Two primary ways to change LLM behavior are prompting and weight updates (e.g., fine-tuning). Prompting LLMs is simple and effective, specifying the desired changes explicitly in natural language, whereas weight updates provide more expressive and permanent behavior changes, specified implicitly via training on large datasets. We present a technique for "baking" prompts into the weights of an LLM. Prompt Baking converts a prompt $u$ and initial weights $\theta$ to a new set of weights $\theta_{u}$ such that the new "baked" LLM behaves like the original prompted LLM. Mathematically, we minimize the KL divergence between $P_{\theta}(\cdot \mid u)$ and $P_{\theta_{u}}(\cdot)$, where $P$ is the LLM's probability distribution over token sequences. Across all our experiments, we find prompts can be readily baked into weight updates. Baking chain-of-thought prompts improves zero-shot performance on GSM8K,…
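The KL objective in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a single categorical distribution over four outcomes in place of a distribution over token sequences, treats a logit vector as a stand-in for the weights, and takes the prompted distribution as the first argument of the KL divergence. The names `bake`, `prompted`, and `unprompted_logits` are hypothetical.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def bake(target, logits, lr=0.5, steps=2000):
    """Gradient descent on KL(target || softmax(logits)).

    The gradient of this KL with respect to the logits is
    softmax(logits) - target, so each step moves the unprompted
    distribution toward the prompted one.
    """
    logits = list(logits)
    for _ in range(steps):
        q = softmax(logits)
        logits = [z - lr * (qi - pi) for z, qi, pi in zip(logits, q, target)]
    return logits


# Assumption: these numbers are made up for illustration only.
prompted = softmax([2.0, 0.5, -1.0, 0.0])   # stand-in for P_theta(. | u)
unprompted_logits = [0.0, 0.0, 0.0, 0.0]    # stand-in for initial weights theta

before = kl_divergence(prompted, softmax(unprompted_logits))
baked_logits = bake(prompted, unprompted_logits)  # stand-in for theta_u
after = kl_divergence(prompted, softmax(baked_logits))

print(f"KL before baking: {before:.6f}")
print(f"KL after baking:  {after:.6f}")
```

In this toy setting the "baked" logits reproduce the prompted distribution without the prompt being present, which is the behavior the objective asks for. A real LLM would require sampling sequences and backpropagating through the network, which this sketch does not attempt.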

Taxonomy

Topics: Persona Design and Applications · Machine Learning in Healthcare · Topic Modeling

Methods: Sparse Evolutionary Training