Demystifying LLM-as-a-Judge: Analytically Tractable Model for Inference-Time Scaling
Indranil Halder, Cengiz Pehlevan

TL;DR
This paper introduces an analytically tractable model for understanding inference-time scaling in large language models, revealing how sampling strategies and reward misspecification affect generalization error and optimal inference-time computation.
Contribution
The paper provides a theoretical framework for inference-time scaling in LLMs using Bayesian linear regression, analyzing the effects of sampling and reward misspecification on generalization.
Findings
Generalization error decreases as 1/k^2 in the number of inference-time samples k, given the optimal reward and sampling temperature.
Reward misspecification can lead to a finite optimal number of inference samples.
Inference-time compute advantage diminishes with increasing task difficulty.
Abstract
Recent developments in large language models have shown advantages in reallocating a notable share of computational resources from training time to inference time. However, the principles behind inference-time scaling are not well understood. In this paper, we introduce an analytically tractable model of inference-time scaling: Bayesian linear regression with a reward-weighted sampler, where the reward is determined from a linear model, modeling the LLM-as-a-judge scenario. We study this problem in the high-dimensional regime, where the deterministic equivalents dictate a closed-form expression for the posterior predictive mean and variance. We analyze the generalization error when training data are sampled from a teacher model. We draw inference-time samples and select among them via a softmax, at a given temperature, applied to a quadratic reward. When the reward is not too different from the teacher, the…
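The setup described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' code, only a minimal toy version under stated assumptions: an isotropic Gaussian prior, a known noise variance, a reward of the form -(y - x·w_judge)^2, and hypothetical names throughout (posterior_predictive, softmax_select, judged_sample, mean_sq_error, w_judge). The paper's exact reward, scaling, and high-dimensional limit may differ.

```python
import numpy as np

def posterior_predictive(X, y, x_test, noise_var=0.1, prior_var=1.0):
    """Posterior predictive mean and variance of Bayesian linear regression.

    Assumes prior w ~ N(0, prior_var * I) and Gaussian label noise (an assumption).
    """
    d = X.shape[1]
    precision = X.T @ X / noise_var + np.eye(d) / prior_var
    cov = np.linalg.inv(precision)
    mean_w = cov @ X.T @ y / noise_var
    pred_mean = x_test @ mean_w
    pred_var = x_test @ cov @ x_test + noise_var
    return pred_mean, pred_var

def softmax_select(candidates, rewards, temperature, rng):
    """Pick one candidate with probability proportional to exp(reward / temperature)."""
    logits = rewards / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum()
    return candidates[rng.choice(len(candidates), p=probs)]

def judged_sample(x_test, pred_mean, pred_var, w_judge, k, temperature, rng):
    """Draw k samples from the predictive distribution; let a linear judge select one."""
    candidates = rng.normal(pred_mean, np.sqrt(pred_var), size=k)
    # Quadratic reward around the judge's own linear prediction (assumed form).
    rewards = -(candidates - x_test @ w_judge) ** 2
    return softmax_select(candidates, rewards, temperature, rng)

def mean_sq_error(x_test, pred_mean, pred_var, w_teacher, w_judge,
                  k, temperature, trials=500, seed=0):
    """Monte Carlo estimate of the squared error against the teacher's output."""
    rng = np.random.default_rng(seed)
    target = x_test @ w_teacher
    errors = [
        (judged_sample(x_test, pred_mean, pred_var, w_judge, k, temperature, rng) - target) ** 2
        for _ in range(trials)
    ]
    return float(np.mean(errors))

# Toy data drawn from a teacher model (all sizes are arbitrary choices).
rng = np.random.default_rng(0)
d, n = 20, 40
w_teacher = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = X @ w_teacher + np.sqrt(0.1) * rng.normal(size=n)
x_test = rng.normal(size=d)
pred_mean, pred_var = posterior_predictive(X, y, x_test)

# Compare a single sample with selection among 32 samples, judge equal to teacher.
for k in (1, 32):
    err = mean_sq_error(x_test, pred_mean, pred_var, w_teacher, w_teacher, k, 0.01)
    print(f"k={k:2d}  estimated squared error = {err:.4f}")
```

Passing a w_judge that differs from w_teacher is one way to experiment with reward misspecification in this toy setting. Whether it reproduces the finite optimal number of samples reported in the findings depends on the assumptions above.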
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Materials Science
