Unveiling the Magic of Code Reasoning through Hypothesis Decomposition and Amendment
Yuze Zhao, Tianyun Ji, Wenjun Feng, Zhenya Huang, Qi Liu, Zhiding Liu, Yixiao Ma, Kai Zhang, Enhong Chen

TL;DR
This paper introduces a novel code reasoning benchmark and a reflective hypothesis decomposition pipeline to improve large language models' reasoning capabilities, achieving significant performance gains and better handling of complex, real-world tasks.
Contribution
The paper proposes a new code reasoning benchmark and a human-inspired hypothesis decomposition pipeline that enhances reasoning accuracy and robustness in large language models.
Findings
LLMs struggle with identifying satisfactory reasoning pathways.
The Reflective Hypothesis Decomposition and Amendment (RHDA) pipeline improves reasoning performance by up to 3x.
Applying the pipeline to real-world tasks enhances failure handling.
Abstract
Reasoning ability is one of the most enigmatic and captivating aspects of large language models (LLMs). Numerous studies are dedicated to exploring and expanding the boundaries of this capability. However, tasks that embody both reasoning and recall characteristics are often overlooked. In this paper, we introduce such a novel task, code reasoning, to provide a new perspective on the reasoning abilities of LLMs. We summarize three meta-benchmarks based on established forms of logical reasoning, and instantiate these into eight specific benchmark tasks. Our testing on these benchmarks reveals that LLMs continue to struggle with identifying satisfactory reasoning pathways. Additionally, we present a new pathway exploration pipeline inspired by intricate human problem-solving methods. This Reflective Hypothesis Decomposition and Amendment (RHDA) pipeline consists of the…
Taxonomy
Topics: Legal Issues in Education · Software Engineering Research
