A Probabilistic Inference Scaling Theory for LLM Self-Correction

Zhe Yang; Yichang Zhang; Yudong Wang; Ziyao Xu; Junyang Lin; Zhifang Sui

arXiv:2508.16456·cs.CL·August 25, 2025

TL;DR

This paper introduces a probabilistic theory modeling how LLMs improve their accuracy through self-correction over multiple rounds, providing a mathematical framework that predicts accuracy evolution.

Contribution

It presents a novel probabilistic model explaining the dynamics of accuracy improvement in LLM self-correction, validated by experiments across various models and datasets.

Findings

01

Accuracy follows an exponential convergence pattern.

02

The model accurately predicts accuracy after a single self-correction round.

03

Theoretical predictions closely match empirical results.

Abstract

Large Language Models (LLMs) have demonstrated the capability to refine their generated answers through self-correction, enabling continuous performance improvement over multiple rounds. However, the mechanisms underlying how and why accuracy evolves during this iterative process remain unexplored. To fill this gap, we propose a probabilistic theory to model the dynamics of accuracy change and explain the performance improvements observed in multi-round self-correction. Through mathematical derivation, we establish that the accuracy after the $t^{\text{th}}$ round of self-correction is given by: $Acc_{t} = Upp - \alpha^{t} (Upp - Acc_{0}),$ where $Acc_{0}$ denotes the initial accuracy, $Upp$ represents the upper bound of accuracy convergence, and $\alpha$ determines the rate of convergence. Based on our theory, these parameters can be calculated and the predicted accuracy curve then can be obtained…
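As a rough illustration of the closed form in the abstract, the sketch below evaluates $Acc_{t} = Upp - \alpha^{t} (Upp - Acc_{0})$ for a few rounds. The function name `predicted_accuracy` and the parameter values are hypothetical, chosen only for illustration. The truncated abstract does not show how the authors compute $Upp$ and $\alpha$, so the sketch takes them as given inputs, and it assumes $0 \le \alpha < 1$ so that the curve converges to $Upp$.

```python
def predicted_accuracy(acc0: float, upp: float, alpha: float, t: int) -> float:
    """Evaluate Acc_t = Upp - alpha^t * (Upp - Acc_0) for round t >= 0."""
    if t < 0:
        raise ValueError("t must be a non-negative round index")
    # The gap to the upper bound shrinks by a factor of alpha each round.
    return upp - (alpha ** t) * (upp - acc0)


if __name__ == "__main__":
    # Hypothetical values, not taken from the paper.
    acc0, upp, alpha = 0.60, 0.90, 0.50
    for t in range(6):
        print(t, round(predicted_accuracy(acc0, upp, alpha, t), 4))
```

With these illustrative inputs, the remaining gap $Upp - Acc_{t}$ halves every round, which is the exponential convergence pattern the formula describes.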
