Implicit Statistical Inference in Transformers: Approximating Likelihood-Ratio Tests In-Context

Faris Chaudhry; Siddhant Gadkari

arXiv:2603.10573 · cs.LG · March 13, 2026

Open Access

TL;DR

This paper studies in-context learning in Transformers through binary hypothesis testing, where the likelihood-ratio test is the optimal policy. It finds that the models construct task-specific statistical estimators from context rather than relying on fixed heuristics.

Contribution

The paper adopts a statistical decision-theoretic framework for analyzing Transformers. It shows that they approximate Bayes-optimal sufficient statistics, up to a monotonic transformation, and adapt their computation to task complexity.

Findings

01. Transformers approximate Bayes-optimal sufficient statistics from context, up to a monotonic transformation.

02. Models adapt decision thresholds dynamically rather than relying on fixed heuristics.

03. Performance matches that of an ideal oracle estimator in nonlinear regimes.

Abstract

In-context learning (ICL) allows Transformers to adapt to novel tasks without weight updates, yet the underlying algorithms remain poorly understood. We adopt a statistical decision-theoretic perspective by investigating simple binary hypothesis testing, where the optimal policy is determined by the likelihood-ratio test. Notably, this setup provides a mathematically rigorous setting for mechanistic interpretability where the target algorithmic ground truth is known. By training Transformers on tasks requiring distinct geometries (linear shifted means vs. nonlinear variance estimation), we demonstrate that the models approximate the Bayes-optimal sufficient statistics from context up to some monotonic transformation, matching the performance of an ideal oracle estimator in nonlinear regimes. Leveraging this analytical ground truth, mechanistic analysis via logit lens and circuit…
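To make the likelihood-ratio baseline concrete, here is a minimal sketch of the test for two toy tasks of the kind the abstract describes. The specifics are assumptions for illustration and are not taken from the paper: scalar Gaussian observations, equal priors (log-threshold 0), a mean shift of `mu = 1`, standard deviations `s0 = 1` and `s1 = 2`, and 16 context samples. All function names are hypothetical. The mean-shift log-likelihood ratio depends on the context only through the sum of the samples (a linear statistic). The variance log-likelihood ratio depends only on the sum of squares (a nonlinear statistic).

```python
import math
import random


def log_lr_mean_shift(xs, mu=1.0):
    """Log-likelihood ratio for H1: N(mu, 1) vs H0: N(0, 1).

    Depends on the context xs only through sum(xs), a linear statistic.
    """
    return mu * sum(xs) - 0.5 * len(xs) * mu ** 2


def log_lr_variance(xs, s0=1.0, s1=2.0):
    """Log-likelihood ratio for H1: N(0, s1^2) vs H0: N(0, s0^2).

    Depends on xs only through sum(x^2), a quadratic (nonlinear) statistic.
    """
    energy = sum(x * x for x in xs)
    return len(xs) * math.log(s0 / s1) + 0.5 * (1.0 / s0 ** 2 - 1.0 / s1 ** 2) * energy


def lrt_decide(log_lr, log_threshold=0.0):
    """Choose H1 (return 1) when the log-likelihood ratio exceeds the threshold.

    A log-threshold of 0 corresponds to equal priors under 0-1 loss.
    """
    return 1 if log_lr > log_threshold else 0


def empirical_accuracy(log_lr_fn, sample_h0, sample_h1,
                       n_context=16, n_trials=2000, seed=0):
    """Monte Carlo accuracy of the likelihood-ratio test on random contexts."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        label = rng.randint(0, 1)  # true hypothesis, equal priors
        draw = sample_h1 if label else sample_h0
        xs = [draw(rng) for _ in range(n_context)]
        correct += lrt_decide(log_lr_fn(xs)) == label
    return correct / n_trials


mean_acc = empirical_accuracy(
    log_lr_mean_shift,
    lambda r: r.gauss(0.0, 1.0),  # H0 sampler
    lambda r: r.gauss(1.0, 1.0),  # H1 sampler
)
var_acc = empirical_accuracy(
    log_lr_variance,
    lambda r: r.gauss(0.0, 1.0),  # H0 sampler
    lambda r: r.gauss(0.0, 2.0),  # H1 sampler
)
print(f"mean-shift accuracy: {mean_acc:.3f}")
print(f"variance accuracy:   {var_acc:.3f}")
```

Any strictly increasing function of the log-likelihood ratio, paired with a correspondingly transformed threshold, yields the same decisions. That is why a model can be said to match the optimal statistic "up to some monotonic transformation".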

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference