Confidence v.s. Critique: A Decomposition of Self-Correction Capability for LLMs
Zhe Yang, Yichang Zhang, Yudong Wang, Ziyao Xu, Junyang Lin, Zhifang Sui

TL;DR
This paper decomposes and evaluates the self-correction capabilities of Large Language Models, introducing metrics for confidence and critique, and proposes a data transformation strategy to enhance self-correction performance.
Contribution
It introduces a novel decomposition of LLM self-correction into confidence and critique capabilities, along with new evaluation metrics and an effective data transformation method to improve self-correction.
Findings
Different models exhibit distinct self-correction behaviors.
There is a trade-off between confidence and critique capabilities.
Transforming SFT data improves self-correction performance.
Abstract
Large Language Models (LLMs) can correct their self-generated responses, but a decline in accuracy after self-correction is also witnessed. To have a deeper understanding of self-correction, we endeavor to decompose, evaluate, and analyze the self-correction behaviors of LLMs. By enumerating and analyzing answer correctness before and after self-correction, we decompose the self-correction capability into confidence (being confident to correct answers) and critique (turning wrong answers to correct) capabilities, and propose two metrics from a probabilistic perspective to measure these 2 capabilities, along with another metric for overall self-correction capability evaluation. Based on our decomposition and evaluation metrics, we conduct extensive experiments and draw some empirical conclusions. For example, we find different models can exhibit distinct behaviors: some models are…
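The abstract describes the decomposition only in words, so the following is a minimal sketch of one natural reading, not the paper's exact metric definitions. It assumes each question is recorded as a pair of booleans (correct before self-correction, correct after), treats confidence as the estimated probability that an initially correct answer stays correct, and treats critique as the estimated probability that an initially wrong answer becomes correct. The names `SelfCorrectionStats` and `decompose` are hypothetical, and the paper's third, overall metric is not reproduced here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class SelfCorrectionStats:
    acc_before: float   # accuracy of the initial answers
    acc_after: float    # accuracy after self-correction
    confidence: float   # est. P(correct after | correct before)
    critique: float     # est. P(correct after | wrong before)


def decompose(pairs: Iterable[Tuple[bool, bool]]) -> SelfCorrectionStats:
    """Estimate the two conditional probabilities from (before, after) outcomes.

    Assumption: this is an illustrative reading of the abstract, not the
    authors' published formulas.
    """
    data: List[Tuple[bool, bool]] = list(pairs)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one (before, after) pair")

    n_correct_before = sum(1 for before, _ in data if before)
    n_wrong_before = n - n_correct_before

    # Correct answers that survived self-correction.
    kept = sum(1 for before, after in data if before and after)
    # Wrong answers that self-correction repaired.
    fixed = sum(1 for before, after in data if not before and after)

    # Either conditional is undefined when its conditioning set is empty.
    confidence = kept / n_correct_before if n_correct_before else float("nan")
    critique = fixed / n_wrong_before if n_wrong_before else float("nan")

    return SelfCorrectionStats(
        acc_before=n_correct_before / n,
        acc_after=(kept + fixed) / n,
        confidence=confidence,
        critique=critique,
    )
```

Under this reading, the law of total probability gives acc_after = acc_before × confidence + (1 − acc_before) × critique. That identity shows how accuracy can fall after self-correction: when a model abandons correct answers more often than it repairs wrong ones, the loss on the first term outweighs the gain on the second.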
Taxonomy
Topics: Legal Systems and Judicial Processes · Taxation and Legal Issues · Business Law and Ethics
Methods: Shrink and Fine-Tune
