Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving

Chuxue Cao; Mengze Li; Juntao Dai; Jinluan Yang; Zijian Zhao; Shengyu Zhang; Weijie Shi; Chengzhong Liu; Sirui Han; Yike Guo

arXiv:2506.17104·cs.AI·June 23, 2025

Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving

Chuxue Cao, Mengze Li, Juntao Dai, Jinluan Yang, Zijian Zhao, Shengyu Zhang, Weijie Shi, Chengzhong Liu, Sirui Han, Yike Guo

PDF

Open Access

TL;DR

This paper introduces DREAM, a novel method to improve large language models' complex mathematical reasoning via first-order logic theorem proving, addressing their current limitations in multi-step deductions and proof strategies.

Contribution

DREAM enhances LLMs' mathematical reasoning with a new inference stage, diversification, and error feedback, leading to performance improvements and a curated theorem dataset.

Findings

01

Performance improved by 0.6% to 6.4% with DREAM

02

Curated dataset of 447 theorems in Lean 4 format

03

Addressed multi-step FOL reasoning challenges in LLMs

Abstract

Large language models (LLMs) have shown promising first-order logic (FOL) reasoning capabilities with applications in various areas. However, their effectiveness in complex mathematical reasoning involving multi-step FOL deductions is still under-researched. While LLMs perform competitively on established mathematical reasoning benchmarks, they struggle with multi-step FOL tasks, as demonstrated by Deepseek-Prover-V2-7B's low accuracy (4.2%) on our proposed theorem proving dataset. This issue arises from the limited exploration of diverse proof strategies and the potential for early reasoning mistakes to undermine entire proofs. To address these issues, we propose DREAM, a self-adaptive solution that enhances the Diversity and REAsonability of LLMs' generation strategies. DREAM incorporates an Axiom-Driven Strategy Diversification mechanism to promote varied strategic outcomes and a…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsTopic Modeling · Mathematics, Computing, and Information Processing · Natural Language Processing Techniques