Large Language Models Can Self-Correct with Key Condition Verification

Zhenyu Wu; Qingkai Zeng; Zhihan Zhang; Zhaoxuan Tan; Chao Shen; Meng Jiang

arXiv:2405.14092·cs.CL·October 4, 2024·1 cites

TL;DR

This paper introduces ProCo, an iterative verify-then-correct framework that improves the reasoning accuracy of large language models. ProCo verifies each response by masking a key condition in the question and checking whether the model can recover it, then corrects responses that fail the check. It significantly outperforms previous self-correction methods.

Contribution

ProCo leverages minimal prompting to verify responses by checking key conditions, enabling effective self-correction in LLMs across multiple reasoning tasks.

Findings

01: +6.8% exact match on open-domain QA datasets

02: +14.1% accuracy on arithmetic reasoning datasets

03: +9.6% accuracy on commonsense reasoning

Abstract

Intrinsic self-correction is a method that instructs large language models (LLMs) to verify and correct their responses without external feedback. Unfortunately, a prior study concluded that LLMs cannot yet self-correct their reasoning. We find that a simple yet effective verification method can unleash the inherent capabilities of LLMs: mask a key condition in the question, add the current response to construct a verification question, and predict the masked condition to verify the response. The condition can be an entity in an open-domain question or a numeric value in a math question, and identifying it requires minimal effort (via prompting). We propose an iterative verify-then-correct framework, named ProCo, to progressively identify and correct (probably) false responses. We conduct experiments on three reasoning tasks. On average, ProCo, with GPT-3.5-Turbo as the backend LLM,…
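The verify-then-correct loop described in the abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the prompt wording, the exact-match check, the `max_iters` limit, and the "probably not" hint format are all hypothetical, and `llm` stands for any callable that maps a prompt string to a response string.

```python
from typing import Callable, List

# Hypothetical interface: any function that takes a prompt and returns text.
LLM = Callable[[str], str]


def build_verification_question(question: str, key_condition: str, answer: str) -> str:
    """Mask the key condition with X and add the current answer as a given fact."""
    masked = question.replace(key_condition, "X", 1)
    # Assumed prompt wording; the paper's actual template may differ.
    return f"{masked} Suppose the answer is {answer}. What is the value of X?"


def is_verified(predicted: str, key_condition: str) -> bool:
    """Assumed check: case-insensitive exact match with the original condition."""
    return predicted.strip().lower() == key_condition.strip().lower()


def proco_sketch(question: str, key_condition: str, llm: LLM, max_iters: int = 3) -> str:
    """Iteratively verify the current answer and re-ask when verification fails."""
    rejected: List[str] = []
    answer = llm(question)
    for _ in range(max_iters):
        verification_q = build_verification_question(question, key_condition, answer)
        if is_verified(llm(verification_q), key_condition):
            return answer  # the masked condition was recovered, so accept the answer
        # Verification failed: record the answer as probably false and ask again.
        rejected.append(answer)
        hint = f"{question} (The answer is probably not: {', '.join(rejected)}.)"
        answer = llm(hint)
    return answer  # iteration budget exhausted; return the latest attempt
```

For a math question, `key_condition` would be a numeric value taken from the question; for an open-domain question, it would be an entity. In this sketch the condition is passed in by the caller, whereas the abstract describes identifying it via prompting.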

Taxonomy

Topics: Topic Modeling

Methods: Attention Is All You Need · Linear Layer · Residual Connection · Byte Pair Encoding · Adam · Dropout · Softmax