A Learning Method for Symbolic Systems Using Large Language Models
Jian Fang, Yixun Yao, Yingfei Xiong

TL;DR
This paper introduces LLM2Ltac, a novel method that uses large language models to synthesize symbolic tactics from proof data, enhancing traditional provers' efficiency and effectiveness.
Contribution
It presents the first approach to leverage LLMs as tactic synthesizers for symbolic provers, improving proof automation without high runtime costs.
Findings
Mine tactics from 11,725 theorems in standard libraries.
Improve CoqHammer to prove 23.87% more theorems.
Increase overall proved theorems by 9.90% when combined with Claude Code.
Abstract
Automated theorem proving is essential for the formal verification of safety-critical systems. As the corpus of formal proofs grows, a natural paradigm is to learn from existing proofs. However, current learning-based approaches predominantly train Large Language Models (LLMs) as end-to-end provers, which yields resource-intensive, opaque systems. Conversely, while traditional symbolic provers are computationally efficient, how to automatically improve these solvers from data remains an open challenge. This paper bridges this gap by proposing LLM2Ltac, the first approach that leverages the reasoning power of LLMs not as end-to-end provers, but as intelligent synthesizers to mine purely symbolic tactics from data. Given a corpus of formal proofs, LLM2Ltac asks an LLM to identify latent proof strategies and formalize them into reusable tactics. These tactics are verified for validity…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
