Understanding Prompt Tuning and In-Context Learning via Meta-Learning

Tim Genewein; Li Kevin Wenliang; Jordi Grau-Moya; Anian Ruoss; Laurent Orseau; Marcus Hutter

arXiv:2505.17010 · cs.LG · October 21, 2025

TL;DR

This paper takes a Bayesian view of prompt tuning and in-context learning. It identifies fundamental limits of prompting that only weight tuning can overcome, explains how meta-trained neural networks adapt rapidly in context, and supports the theory with experiments on LSTMs and Transformers.

Contribution

The paper introduces a Bayesian framework for understanding prompt tuning that explains the behavior of meta-trained models and the effectiveness of soft prefixes in prompt optimization.

Findings

01. Meta-trained networks act as Bayesian predictors with rapid in-context adaptation.

02. Soft prefixes can effectively manipulate activations beyond hard token prompts.

03. Theoretical criteria determine when optimal prompting is feasible.

Abstract

Prompting is one of the main ways to adapt a pretrained model to target tasks. Besides manually constructing prompts, many prompt optimization methods have been proposed in the literature. Method development is mainly empirically driven, with less emphasis on a conceptual understanding of prompting. In this paper we discuss how optimal prompting can be understood through a Bayesian view, which also implies some fundamental limitations of prompting that can only be overcome by tuning weights. The paper explains in detail how meta-trained neural networks behave as Bayesian predictors over the pretraining distribution, whose hallmark feature is rapid in-context adaptation. Optimal prompting can be studied formally as conditioning these Bayesian predictors, yielding criteria for target tasks where optimal prompting is and is not possible. We support the theory with educational experiments…
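The idea of prompting as conditioning a Bayesian predictor can be illustrated with a toy sketch. The setup below is an assumption made for illustration and is not taken from the paper or its repository: the "pretraining distribution" is a uniform prior over three hypothetical coin biases, the predictor is the exact Bayesian mixture over them, and a "prompt" is a binary prefix that the predictor conditions on before predicting the next symbol.

```python
# Toy illustration (assumed setup, not the paper's experiments):
# an exact Bayesian mixture predictor over coin biases, conditioned on a prefix.

def posterior(prior, biases, prefix):
    """Posterior over coin biases after observing a binary prefix (Bayes' rule)."""
    ones = sum(prefix)
    zeros = len(prefix) - ones
    # Likelihood of the prefix under each bias, weighted by the prior.
    unnorm = [p * (b ** ones) * ((1 - b) ** zeros) for p, b in zip(prior, biases)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def predict_one(prior, biases, prefix):
    """Posterior predictive probability that the next symbol is 1."""
    post = posterior(prior, biases, prefix)
    return sum(w * b for w, b in zip(post, biases))

# Hypothetical "pretraining distribution": three coins, equally likely.
biases = [0.1, 0.5, 0.9]
prior = [1 / 3, 1 / 3, 1 / 3]

p_empty = predict_one(prior, biases, [])          # no prompt: prior predictive
p_prompted = predict_one(prior, biases, [1] * 5)  # prompt of five 1s

print(f"P(next=1 | empty prompt)  = {p_empty:.3f}")
print(f"P(next=1 | prompt 11111)  = {p_prompted:.3f}")
```

In this sketch the prompt changes no parameters. It only shifts the posterior toward the coin with bias 0.9, so the prediction moves from the prior average toward that coin's bias. This is the sense in which conditioning on a prefix steers a Bayesian predictor within what its prior already supports.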

Code & Models

Repositories

google-deepmind/thunnini (JAX, official)
