A Simple Model of Inference Scaling Laws
Noam Levi

TL;DR
This paper introduces a simple statistical model to understand how inference performance improves with repeated attempts in large language models, linking coverage, inference loss, and prompting costs.
Contribution
The paper proposes an inference scaling law based on a memorization ansatz. The law connects coverage to an inference loss, is validated through experiments on a simple generative model, and is presented as applicable alongside other scaling laws.
Findings
Coverage (pass@k) improves with repeated inference attempts and follows a predictable functional form.
The inference loss decays as a power law in the number of trials.
Model predictions align with empirical coverage curves in controlled experiments.
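The power-law finding can be illustrated with a minimal sketch. It assumes, purely for illustration, that each sample has a per-attempt success probability p drawn from a Beta(a, b) distribution and that attempts are independent. The Beta choice and the names expected_failure, a, and b are assumptions of this sketch and are not taken from the summary above. Under that assumption, the expected failure rate after k attempts is E[(1 - p)^k] = B(a, b + k) / B(a, b), which decays like k^(-a) for large k.

```python
import math

def expected_failure(k, a, b):
    """Expected failure rate after k independent attempts, E[(1 - p)^k],
    assuming (hypothetically) p ~ Beta(a, b).
    Equals B(a, b + k) / B(a, b), computed in log space for stability."""
    log_val = (math.lgamma(b + k) + math.lgamma(a + b)
               - math.lgamma(a + b + k) - math.lgamma(b))
    return math.exp(log_val)

def expected_coverage(k, a, b):
    """Expected pass@k under the same assumption."""
    return 1.0 - expected_failure(k, a, b)

def loglog_slope(k1, k2, a, b):
    """Slope of log(failure) against log(k) between k1 and k2."""
    f1 = expected_failure(k1, a, b)
    f2 = expected_failure(k2, a, b)
    return (math.log(f2) - math.log(f1)) / (math.log(k2) - math.log(k1))

if __name__ == "__main__":
    a, b = 0.5, 2.0  # illustrative values only
    for k in (1, 10, 100, 1000):
        print(k, expected_coverage(k, a, b))
    # For large k the slope approaches -a, i.e. a power-law decay.
    print(loglog_slope(1000, 2000, a, b))
```

In this toy setting the exponent of the decay is set by a, which controls how much probability mass sits on hard samples (p near zero); b only affects the prefactor.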
Abstract
Neural scaling laws have garnered significant interest due to their ability to predict model performance as a function of increasing parameters, data, and compute. In this work, we propose a simple statistical ansatz based on memorization to study scaling laws in the context of inference, specifically how performance improves with multiple inference attempts. We explore the coverage, or pass@k metric, which measures the chance of success over repeated attempts and provide a motivation for the observed functional form of the inference scaling behavior of the coverage in large language models (LLMs) on reasoning tasks. We then define an "inference loss", which exhibits a power law decay as the number of trials increases, and connect this result with prompting costs. We further test our construction by conducting experiments on a simple generative model, and find that our predictions are…
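As a concrete reference for the coverage metric, the sketch below computes pass@k from sampled attempts using the standard unbiased combinatorial estimator, 1 - C(n - c, k) / C(n, k), where n attempts were drawn for a problem and c of them succeeded. This is a common way to estimate pass@k in practice and is not claimed to be the procedure used in the paper; the function names and the example counts are hypothetical.

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased estimate of pass@k for one problem.
    n: total attempts sampled, c: successful attempts, k: attempt budget."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Probability that a random size-k subset of the n attempts has no success.
    return 1.0 - comb(n - c, k) / comb(n, k)

def coverage(counts, k):
    """Average pass@k over problems; counts is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in counts) / len(counts)

if __name__ == "__main__":
    # Hypothetical results: (attempts, successes) for three problems.
    counts = [(10, 0), (10, 1), (10, 5)]
    for k in (1, 5, 10):
        print(k, coverage(counts, k))
```

Problems with no successes among the n attempts contribute zero at every k, so the estimated coverage can only saturate at the fraction of problems solved at least once.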
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Natural Language Processing Techniques
