Correction and Corruption: A Two-Rate View of Error Flow in LLM Protocols

Fernando Reitich

arXiv:2604.18245·cs.LG·April 28, 2026

TL;DR

This paper introduces a paired-outcome measurement interface for auditing a single step of a large language model protocol. The interface separates how often the step corrects a wrong baseline answer from how often it corrupts a right one, so the two effects can be tracked across different conditions.

Contribution

It proposes a paired-outcome measurement framework that separates correction from corruption, providing diagnostics for stability, mixture bias, and composition in LLM protocols.

Findings

01. The interface predicts accuracy changes effectively in synthetic and real tasks.

02. Calibration and difficulty proxies improve stability under distribution shifts.

03. Diagnostics identify when multi-step protocols can be reliably composed.

Abstract

Large language models are increasingly deployed as protocols: structured multi-call procedures that spend additional computation to transform a baseline answer into a final one. These protocols are evaluated only by end-to-end accuracy, giving limited insight into when they help, when they hurt, and whether their behavior transfers under distribution shift or composition. We propose a paired-outcome measurement interface for auditing a single protocol step on exact-match tasks. For each instance, the interface records a baseline correctness bit $E_{0} \in \{0, 1\}$ and a post-step correctness bit $E_{1} \in \{0, 1\}$, separating correction ($E_{0} = 0 \to E_{1} = 1$) from corruption ($E_{0} = 1 \to E_{1} = 0$) through two rates: $c = \Pr(E_{1} = 1 \mid E_{0} = 0)$ and $\gamma = \Pr(E_{1} = 0 \mid E_{0} = 1)$. These rates predict accuracy changes and define a reusable empirical interface testable across seeds, mixtures, and pipelines.…
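
To make the two rates concrete, the following minimal Python sketch estimates $c$ and $\gamma$ from paired correctness bits and checks them against the post-step accuracy. It relies on the law of total probability, which gives $a_{1} = a_{0}(1 - \gamma) + (1 - a_{0})\,c$ for baseline accuracy $a_{0} = \Pr(E_{0} = 1)$; that identity follows from the definitions in the abstract but is not quoted from the paper. The function names `two_rates` and `predicted_accuracy`, the toy data, and the choice to report a rate of 0.0 when its conditioning set is empty are all illustrative assumptions, not the author's implementation.

```python
from typing import Sequence, Tuple


def two_rates(e0: Sequence[int], e1: Sequence[int]) -> Tuple[float, float, float]:
    """Estimate (a0, c, gamma) from paired correctness bits.

    e0[i] is 1 if the baseline answer on instance i is correct, else 0.
    e1[i] is the same bit after the protocol step.
    (Hypothetical helper; names are not taken from the paper.)
    """
    if len(e0) != len(e1) or len(e0) == 0:
        raise ValueError("e0 and e1 must be non-empty and of equal length")

    n = len(e0)
    n_right = sum(1 for b in e0 if b == 1)
    n_wrong = n - n_right

    # Correction: baseline wrong, post-step right.
    corrected = sum(1 for b0, b1 in zip(e0, e1) if b0 == 0 and b1 == 1)
    # Corruption: baseline right, post-step wrong.
    corrupted = sum(1 for b0, b1 in zip(e0, e1) if b0 == 1 and b1 == 0)

    # Assumption: an undefined rate (empty conditioning set) is reported as 0.0;
    # its weight in the accuracy identity is zero in that case anyway.
    c = corrected / n_wrong if n_wrong else 0.0
    gamma = corrupted / n_right if n_right else 0.0
    return n_right / n, c, gamma


def predicted_accuracy(a0: float, c: float, gamma: float) -> float:
    """Post-step accuracy implied by the two rates: a0*(1 - gamma) + (1 - a0)*c."""
    return a0 * (1.0 - gamma) + (1.0 - a0) * c


if __name__ == "__main__":
    # Toy data (made up for illustration): 10 instances.
    e0 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
    e1 = [1, 1, 0, 1, 1, 0, 1, 0, 1, 0]

    a0, c, gamma = two_rates(e0, e1)
    a1_pred = predicted_accuracy(a0, c, gamma)
    a1_obs = sum(e1) / len(e1)

    print(f"a0={a0:.2f}  c={c:.2f}  gamma={gamma:.2f}")
    print(f"predicted a1={a1_pred:.2f}  observed a1={a1_obs:.2f}")
    print(f"accuracy change={(1 - a0) * c - a0 * gamma:+.2f}")
```

On the same sample the identity holds exactly, so the predicted and observed post-step accuracy agree. The more informative check, in the spirit of the abstract, would be to estimate $c$ and $\gamma$ on one seed or mixture and test the prediction on another.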
