A Theory of Inference Compute Scaling: Reasoning through Directed Stochastic Skill Search

Austin R. Ellis-Mohr; Anuj K. Nayak; Lav R. Varshney

arXiv:2507.00004·cs.LG·July 11, 2025

TL;DR

This paper introduces DS3, a framework modeling inference as stochastic traversal over skill graphs, providing analytical insights into inference strategies, resource efficiency, and scaling behaviors of large language models.

Contribution

The paper develops DS3, a novel theoretical framework that unifies various inference strategies and explains their efficiency and scaling patterns in LLMs.

Findings

01. Accuracy scales linearly with the logarithm of inference compute

02. The preferred inference strategy varies with task difficulty and model capability

03. Reasoning elicits emergent behavior even when performance plateaus
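
To make the first finding concrete, the sketch below shows what "accuracy linear in log compute" means numerically: each doubling of compute adds the same fixed increment to accuracy until the curve saturates. This is not the DS3 model itself. The function name `accuracy_from_compute` and the `intercept` and `slope` values are hypothetical, chosen only for illustration and not taken from the paper.

```python
import math


def accuracy_from_compute(compute, intercept=0.2, slope=0.05):
    """Hypothetical log-linear curve: intercept + slope * log2(compute).

    The coefficients are illustrative assumptions, not values from the paper.
    The result is clipped to [0, 1] because accuracy cannot leave that range.
    """
    acc = intercept + slope * math.log2(compute)
    return min(1.0, max(0.0, acc))


# Compute budgets that double at each step (arbitrary units).
budgets = [1, 2, 4, 8, 16]
accs = [accuracy_from_compute(c) for c in budgets]

# Under a log-linear law, every doubling yields the same accuracy gain.
gains = [later - earlier for earlier, later in zip(accs, accs[1:])]

for c, a in zip(budgets, accs):
    print(f"compute={c:>3}  accuracy={a:.2f}")
```

Under these assumed coefficients every entry of `gains` equals `slope`, which is the practical reading of the finding: equal multiplicative increases in compute buy equal additive increases in accuracy, so each further point of accuracy costs exponentially more compute.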

Abstract

Large language models (LLMs) demand considerable computational, energy, and financial resources during both training and deployment. While scaling laws for training have guided much of the field's recent progress, inference costs now represent a significant and growing component of the overall resource burden, particularly for reasoning-focused models. Existing characterizations of compute-optimality that consider model size, dataset size, and inference tokens in isolation or in fixed combinations risk overlooking more efficient operating points. We introduce directed stochastic skill search (DS3), a general framework that represents inference as stochastic traversal over a learned skill graph. From a simplified yet expressive instantiation, we derive closed-form expressions for task success and compute cost across a wide range of inference strategies -- including chain-of-thought (CoT)…

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Ethics and Social Impacts of AI