Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective

Fabian Falck; Ziyu Wang; Chris Holmes

arXiv:2406.00793·stat.ML·June 4, 2024·2 cites

Open Access · 1 repo

TL;DR

This paper investigates whether in-context learning in large language models behaves like Bayesian inference by analyzing the martingale property, providing theoretical checks and experimental evidence that challenge this hypothesis.

Contribution

It introduces a martingale-based framework to evaluate the Bayesian nature of ICL and presents empirical tests that reveal deviations from Bayesian behavior in LLMs.

Findings

01. Violations of the martingale property in experiments
02. Uncertainty does not decrease as expected with more data
03. ICL does not fully exhibit Bayesian scaling behavior

Abstract

In-context learning (ICL) has emerged as a particularly remarkable characteristic of Large Language Models (LLM): given a pretrained LLM and an observed dataset, LLMs can make predictions for new data points from the same distribution without fine-tuning. Numerous works have postulated ICL as approximately Bayesian inference, rendering this a natural hypothesis. In this work, we analyse this hypothesis from a new angle through the martingale property, a fundamental requirement of a Bayesian learning system for exchangeable data. We show that the martingale property is a necessary condition for unambiguous predictions in such scenarios, and enables a principled, decomposed notion of uncertainty vital in trustworthy, safety-critical systems. We derive actionable checks with corresponding theory and test statistics which must hold if the martingale property is satisfied. We also examine if…
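To make the martingale property concrete: for a Bayesian predictive rule on exchangeable data, the prediction made now must equal the average of the next-step predictions, taken over the model's own distribution for the next observation. The sketch below illustrates this on a toy Beta-Bernoulli model, which is an assumption chosen for illustration and does not come from the paper or its repository. The function names (`bayes_predictive`, `sharpened_predictive`, `martingale_gap`) are hypothetical, and the gap computed here is a one-step illustration, not the paper's test statistic.

```python
from fractions import Fraction


def bayes_predictive(successes, n, a=1, b=1):
    """Posterior predictive P(next = 1 | data) under a Beta(a, b) prior."""
    return Fraction(a + successes, a + b + n)


def sharpened_predictive(successes, n):
    """A non-Bayesian rule: the Bayesian predictive pushed towards 0 or 1."""
    p = bayes_predictive(successes, n)
    return p * p / (p * p + (1 - p) ** 2)


def martingale_gap(predictor, successes, n):
    """One-step martingale gap of a predictive rule.

    Averages the next-step prediction over the rule's own distribution
    for the next outcome, then subtracts the current prediction.
    A gap of zero is what the martingale property requires.
    """
    p = predictor(successes, n)
    if_one = predictor(successes + 1, n + 1)   # next observation is 1
    if_zero = predictor(successes, n + 1)      # next observation is 0
    return p * if_one + (1 - p) * if_zero - p


if __name__ == "__main__":
    for successes, n in [(0, 0), (3, 5), (7, 20)]:
        print(
            successes,
            n,
            martingale_gap(bayes_predictive, successes, n),
            martingale_gap(sharpened_predictive, successes, n),
        )
```

Exact rational arithmetic is used so that the Bayesian rule's gap is exactly zero rather than zero up to floating-point error. The sharpened rule is included only as a contrast: it produces a nonzero gap, which is the kind of deviation a martingale-based check is designed to detect.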

Code & Models

Repositories

meta-inf/bayes_icl (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling