Self-Correction as Feedback Control: Error Dynamics, Stability Thresholds, and Prompt Interventions in LLMs

Aofan Liu; Jingxiang Meng

arXiv:2604.22273 · cs.AI · May 5, 2026

TL;DR

This paper models iterative self-correction in large language models as a feedback control system, identifying stability thresholds and demonstrating how prompt interventions can prevent performance degradation.

Contribution

The paper introduces a two-state Markov model of error dynamics in self-correction, derives a directly measurable stability threshold from it, and empirically tests how prompt interventions affect model accuracy.

Findings

01. A sharp, near-zero threshold on the Error Introduction Rate (EIR < 0.5%) separates beneficial from harmful self-correction.

02. Prompt interventions can reduce EIR and reverse degradation in GPT-4o-mini.

03. Adaptive self-consistency halts harmful refinement and reveals a two-tier capability structure.

Abstract

Iterative self-correction is increasingly deployed in agentic LLM systems, yet whether repeated refinement improves or degrades performance remains inconsistent across models. We recast self-correction as a closed-loop feedback-control problem in which the same model is both controller and plant, and analyze its error dynamics via a two-state Markov model over {Correct, Incorrect}, parameterized by the Error Introduction Rate (EIR) and Error Correction Rate (ECR). The model yields a directly measurable stability threshold -- iterate only when ECR/EIR > Acc/(1-Acc) -- in which EIR acts as a stability margin and prompting becomes lightweight controller design. Empirically, across 7 models and 3 datasets (GSM8K, MATH, StrategyQA), a sharp near-zero EIR boundary (< 0.5%) cleanly separates beneficial from harmful self-correction: only o3-mini (+3.4 pp), Claude Opus 4.6 (+0.6 pp), and o4-mini…
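
The threshold quoted in the abstract can be recovered from a one-step update of the two-state chain. The sketch below assumes (the abstract does not spell this out) that EIR is the per-round probability of turning a correct answer into an incorrect one and ECR is the per-round probability of fixing an incorrect one. Under that reading, accuracy after one round is Acc' = Acc(1 - EIR) + (1 - Acc)ECR, and Acc' > Acc exactly when ECR/EIR > Acc/(1 - Acc). The function names and the numbers in the usage note are illustrative, not taken from the paper.

```python
def step_accuracy(acc: float, eir: float, ecr: float) -> float:
    """Accuracy after one refinement round of the assumed two-state chain.

    Assumption: eir = P(Correct -> Incorrect), ecr = P(Incorrect -> Correct).
    """
    return acc * (1.0 - eir) + (1.0 - acc) * ecr


def should_iterate(acc: float, eir: float, ecr: float) -> bool:
    """Threshold from the abstract: iterate only when ECR/EIR > Acc/(1 - Acc).

    Written in cross-multiplied form so that eir == 0 or acc == 1
    does not cause a division by zero.
    """
    return ecr * (1.0 - acc) > eir * acc


def simulate(acc0: float, eir: float, ecr: float, rounds: int) -> list[float]:
    """Accuracy trajectory over several rounds, holding EIR and ECR fixed."""
    accs = [acc0]
    for _ in range(rounds):
        accs.append(step_accuracy(accs[-1], eir, ecr))
    return accs
```

As a hypothetical illustration, a model at 90% accuracy with ECR = 10% passes the check when EIR = 0.4% but fails it when EIR = 2%, and `simulate` shows accuracy rising in the first case and falling in the second. Holding EIR and ECR constant across rounds is a simplification made here for the sketch.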
