Confidence v.s. Critique: A Decomposition of Self-Correction Capability for LLMs

Zhe Yang, Yichang Zhang, Yudong Wang, Ziyao Xu, Junyang Lin, Zhifang Sui

arXiv:2412.19513 · cs.CL · December 30, 2024

TL;DR

This paper decomposes and evaluates the self-correction capabilities of Large Language Models, introducing metrics for confidence and critique, and proposes a data transformation strategy to enhance self-correction performance.

Contribution

It introduces a novel decomposition of LLM self-correction into confidence and critique capabilities, along with new evaluation metrics and an effective data transformation method to improve self-correction.

Findings

01. Different models exhibit distinct self-correction behaviors.

02. There is a trade-off between confidence and critique capabilities.

03. Transforming SFT data improves self-correction performance.

Abstract

Large Language Models (LLMs) can correct their self-generated responses, but a decline in accuracy after self-correction is also witnessed. To have a deeper understanding of self-correction, we endeavor to decompose, evaluate, and analyze the self-correction behaviors of LLMs. By enumerating and analyzing answer correctness before and after self-correction, we decompose the self-correction capability into confidence (being confident to correct answers) and critique (turning wrong answers to correct) capabilities, and propose two metrics from a probabilistic perspective to measure these 2 capabilities, along with another metric for overall self-correction capability evaluation. Based on our decomposition and evaluation metrics, we conduct extensive experiments and draw some empirical conclusions. For example, we find different models can exhibit distinct behaviors: some models are…
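The decomposition described in the abstract can be illustrated with a minimal sketch. It assumes that confidence is estimated as the fraction of initially correct answers that remain correct after self-correction, and critique as the fraction of initially wrong answers that become correct. These conditional-frequency estimates, the function name `decompose_self_correction`, and the dictionary keys are assumptions made for illustration; the paper's exact metric definitions, including its overall self-correction metric, may differ.

```python
from typing import Dict, Sequence


def decompose_self_correction(
    before: Sequence[bool], after: Sequence[bool]
) -> Dict[str, float]:
    """Estimate confidence and critique from per-question correctness flags.

    before[i] / after[i] say whether the answer to question i was correct
    before / after the self-correction round. Hypothetical helper, not the
    authors' implementation.
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    if not before:
        raise ValueError("at least one question is required")

    n = len(before)
    n_correct_before = sum(bool(b) for b in before)
    n_wrong_before = n - n_correct_before

    # Correct -> correct: the model kept an answer that was already right.
    kept = sum(bool(b) and bool(a) for b, a in zip(before, after))
    # Wrong -> correct: the model repaired an answer that was wrong.
    fixed = sum((not b) and bool(a) for b, a in zip(before, after))

    nan = float("nan")
    return {
        "acc_before": n_correct_before / n,
        "acc_after": (kept + fixed) / n,
        # Assumed estimate of P(correct after | correct before).
        "confidence": kept / n_correct_before if n_correct_before else nan,
        # Assumed estimate of P(correct after | wrong before).
        "critique": fixed / n_wrong_before if n_wrong_before else nan,
    }


if __name__ == "__main__":
    # Toy data: five questions, three answered correctly at first.
    before = [True, True, True, False, False]
    after = [True, True, False, True, False]
    print(decompose_self_correction(before, after))
```

Under these assumptions, accuracy after self-correction equals `acc_before * confidence + (1 - acc_before) * critique`, so accuracy can fall after self-correction when confidence is low even if critique is high. In the toy data the two effects cancel and accuracy is unchanged.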

Code & Models

Repositories

Zhe-Young/SelfCorrectDecompose (official)

Videos

Confidence v.s. Critique: A Decomposition of Self-Correction Capability for LLMs · Underline

Taxonomy

Topics: Legal Systems and Judicial Processes · Taxation and Legal Issues · Business Law and Ethics

Methods: Shrink and Fine-Tune