Large Language Models Can Self-Correct with Key Condition Verification
Zhenyu Wu, Qingkai Zeng, Zhihan Zhang, Zhaoxuan Tan, Chao Shen, Meng Jiang

TL;DR
This paper introduces ProCo, a simple verification framework that enhances large language models' reasoning accuracy by iteratively verifying and correcting responses through key condition verification, significantly outperforming previous self-correction methods.
Contribution
ProCo leverages minimal prompting to verify responses by checking key conditions, enabling effective self-correction in LLMs across multiple reasoning tasks.
Findings
+6.8% exact match on open-domain QA datasets
+14.1% accuracy on arithmetic reasoning datasets
+9.6% accuracy on commonsense reasoning
Abstract
Intrinsic self-correction is a method that instructs large language models (LLMs) to verify and correct their responses without external feedback. Unfortunately, a prior study concluded that LLMs cannot yet self-correct their reasoning. We find that a simple yet effective verification method can unleash the inherent capabilities of LLMs: mask a key condition in the question, add the current response to construct a verification question, and predict the condition to verify the response. The condition can be an entity in an open-domain question or a numeric value in a math question, and identifying it requires minimal effort (via prompting). We propose an iterative verify-then-correct framework, named ProCo, to progressively identify and correct (probably) false responses. We conduct experiments on three reasoning tasks. On average, ProCo, with GPT-3.5-Turbo as the backend LLM,…
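The verify-then-correct loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `llm` callable, the prompt wording, the `X` mask token, the exact-match comparison, and the "probably not" hint used for correction are all hypothetical choices made here, and the key condition is passed in directly rather than identified by prompting.

```python
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt string to a reply string.
LLM = Callable[[str], str]

MASK = "X"  # assumed placeholder for the masked key condition


def build_verification_question(question: str, key_condition: str, answer: str) -> str:
    """Mask the first occurrence of the key condition and append the candidate answer."""
    masked = question.replace(key_condition, MASK, 1)
    return f"{masked} Suppose the answer is {answer}. What is the value of {MASK}?"


def verify(llm: LLM, question: str, key_condition: str, answer: str) -> bool:
    """Accept the answer if the model recovers the masked condition from it.

    Exact string match is a simplification chosen for this sketch.
    """
    predicted = llm(build_verification_question(question, key_condition, answer))
    return predicted.strip().lower() == key_condition.strip().lower()


def verify_then_correct(llm: LLM, question: str, key_condition: str, max_iters: int = 3) -> str:
    """Iteratively verify the current answer and re-ask when verification fails."""
    answer = llm(question).strip()
    rejected: List[str] = []
    for _ in range(max_iters):
        if verify(llm, question, key_condition, answer):
            break  # the masked condition was recovered, so keep this answer
        rejected.append(answer)
        # Assumed correction prompt: steer the model away from rejected answers.
        hint = f"{question} (The answer is probably not: {', '.join(rejected)}.)"
        answer = llm(hint).strip()
    return answer
```

In practice `llm` would wrap a call to a chat model such as GPT-3.5-Turbo; for an open-domain question the key condition would be an entity, and for a math question a numeric value, as the abstract notes.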
Taxonomy
Topics: Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Residual Connection · Byte Pair Encoding · Adam · Dropout · Softmax
