A Learning Method for Symbolic Systems Using Large Language Models

Jian Fang; Yixun Yao; Yingfei Xiong

arXiv:2605.08694·cs.SE·May 12, 2026

A Learning Method for Symbolic Systems Using Large Language Models

Jian Fang, Yixun Yao, Yingfei Xiong

PDF

TL;DR

This paper introduces LLM2Ltac, a novel method that uses large language models to synthesize symbolic tactics from proof data, enhancing traditional provers' efficiency and effectiveness.

Contribution

It presents the first approach to leverage LLMs as tactic synthesizers for symbolic provers, improving proof automation without high runtime costs.

Findings

01

Mine tactics from 11,725 theorems in standard libraries.

02

Improve CoqHammer to prove 23.87% more theorems.

03

Increase overall proved theorems by 9.90% when combined with Claude Code.

Abstract

Automated theorem proving is essential for the formal verification of safety-critical systems. As the corpus of formal proofs grows, a natural paradigm is to learn from existing proofs. However, current learning-based approaches predominantly train Large Language Models (LLMs) as end-to-end provers, which yields resource-intensive, opaque systems. Conversely, while traditional symbolic provers are computationally efficient, how to automatically improve these solvers from data remains an open challenge. This paper bridges this gap by proposing LLM2Ltac, the first approach that leverages the reasoning power of LLMs not as end-to-end provers, but as intelligent synthesizers to mine purely symbolic tactics from data. Given a corpus of formal proofs, LLM2Ltac asks an LLM to identify latent proof strategies and formalize them into reusable tactics. These tactics are verified for validity…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.