On the Intrinsic Self-Correction Capability of LLMs: Uncertainty and Latent Concept
Guangliang Liu, Haitao Mao, Bochuan Cao, Zhiyu Xue, Xitong Zhang, Rongrong Wang, Jiliang Tang, Kristen Johnson

TL;DR
This paper investigates intrinsic self-correction in Large Language Models (LLMs). It shows that repeated rounds of self-correction progressively improve responses until performance stabilizes, and it attributes this convergence to reduced model uncertainty and the activation of latent concepts.
Contribution
It introduces a mathematical framework and simulation to explain how self-correction converges by reducing uncertainty and activating latent concepts in LLMs.
Findings
Intrinsic self-correction improves with iterations in multi-round QA.
Iterative instructions reduce model uncertainty and calibration error.
Self-correction converges to stable performance through uncertainty reduction.
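The calibration error mentioned in the findings is commonly measured as expected calibration error (ECE). The summary does not say which estimator the paper uses, so the following is only a minimal sketch of the standard equal-width-bin ECE. The function name `expected_calibration_error` and the parameter `n_bins` are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Equal-width-bin ECE (an assumed, standard estimator; not the paper's code).

    confidences: model confidence in [0, 1] for each answer.
    correct: whether each answer was actually right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    conf_sum = [0.0] * n_bins   # sum of confidences per bin
    hit_sum = [0.0] * n_bins    # number of correct answers per bin
    count = [0] * n_bins        # number of samples per bin

    for c, ok in zip(confidences, correct):
        # Confidence 1.0 is placed in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hit_sum[idx] += 1.0 if ok else 0.0
        count[idx] += 1

    ece = 0.0
    for s_conf, s_hit, cnt in zip(conf_sum, hit_sum, count):
        if cnt:
            accuracy = s_hit / cnt
            avg_conf = s_conf / cnt
            # Weight each bin's |accuracy - confidence| gap by its share of samples.
            ece += (cnt / n) * abs(accuracy - avg_conf)
    return ece
```

Under this sketch, one would compute the ECE of the model's answers after each self-correction round and check whether it shrinks from round to round, which is the trend the findings describe.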
Abstract
Large Language Models (LLMs) are able to improve their responses when instructed to do so, a capability known as self-correction. When instructions provide only the task's goal without specific details about potential issues in the response, LLMs must rely on their internal knowledge to improve response quality, a process referred to as intrinsic self-correction. The empirical success of intrinsic self-correction is evident in various applications, but how and why it is effective remains unknown. In this paper, we unveil that intrinsic self-correction can be progressively improved, allowing it to approach a converged state. Our findings are verified in: (1) the scenario of multi-round question answering, by comprehensively demonstrating that intrinsic self-correction can progressively introduce performance gains through iterative interactions, ultimately converging to stable…
Taxonomy
Topics: Fault Detection and Control Systems
