A Combinatorial Identities Benchmark for Theorem Proving via Automated Theorem Generation
Beibei Xiong, Hangyu Lv, Haojia Shan, Jianlin Wang, Zhengfeng Yang,, Lihong Zhi

TL;DR
This paper introduces LeanComb, a new benchmark for formalizing combinatorial identities in Lean, and develops ATG4CI, an automated theorem generator that enhances theorem proving capabilities in combinatorics using LLMs and reinforcement learning.
Contribution
It presents the first formalized combinatorial identities benchmark in Lean and a novel automated theorem generator combining LLMs with reinforcement learning for tactic prediction.
Findings
Generated 260K theorems with formal proofs in Lean.
Models trained on this dataset improve tactic effectiveness.
Enhanced models achieve higher success rates in ATP for combinatorial identities.
Abstract
Large language models (LLMs) have significantly advanced formal theorem proving, yet the scarcity of high-quality training data constrains their capabilities in complex mathematical domains. Combinatorics, a cornerstone of mathematics, provides essential tools for analyzing discrete structures and solving optimization problems. However, its inherent complexity makes it particularly challenging for automated theorem proving (ATP) for combinatorial identities. To address this, we manually construct LeanComb, combinatorial identities benchmark in Lean, which is, to our knowledge, the first formalized theorem proving benchmark built for combinatorial identities. We develop an Automated Theorem Generator for Combinatorial Identities, ATG4CI, which combines candidate tactics suggested by a self-improving large language model with a Reinforcement Learning Tree Search approach for tactic…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsLogic, programming, and type systems
