Reinforced Large Language Model is a formal theorem prover
Zhiling Luo

TL;DR
This paper introduces a reinforcement learning framework that adapts pretrained large language models to formal theorem proving and reports higher accuracy than direct fine-tuning.
Contribution
It presents a novel reinforcement learning approach that iteratively optimizes pretrained LLMs for theorem proving tasks, outperforming direct fine-tuning.
Findings
Higher accuracy on theorem-proving tasks than a directly fine-tuned LLM
Effective iterative optimization of LLMs
Reinforcement learning improves formalization performance
Abstract
To take advantage of large language models (LLMs) in theorem formalization and proof, we propose a reinforcement learning framework that iteratively optimizes a pretrained LLM by rolling out next tactics and comparing them with the expected ones. Experimental results show that this achieves higher accuracy than a directly fine-tuned LLM.
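The abstract does not specify the reward or the optimization algorithm, so the following is only a toy sketch of the "roll out next tactics and compare with the expected ones" idea, not the paper's implementation. It assumes a hypothetical four-tactic vocabulary, an exact-match reward, and a plain REINFORCE update with a mean-reward baseline over a table of logits that stands in for the LLM policy. All names (`TACTICS`, `reward`, `reinforce_step`, `train`) are illustrative.

```python
import math
import random

# Hypothetical tactic vocabulary; a real prover exposes far more tactics.
TACTICS = ["intro", "simp", "rfl", "induction"]


def softmax(logits):
    """Convert logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reward(rolled_out, expected):
    """Assumed reward: 1.0 if the rolled-out tactic matches the expected one."""
    return 1.0 if rolled_out.strip() == expected.strip() else 0.0


def reinforce_step(logits, expected, rng, lr=0.5, n_rollouts=8):
    """One REINFORCE update for a single proof state (assumed algorithm)."""
    probs = softmax(logits)
    # Roll out candidate next tactics from the current policy.
    samples = rng.choices(range(len(TACTICS)), weights=probs, k=n_rollouts)
    rewards = [reward(TACTICS[i], expected) for i in samples]
    baseline = sum(rewards) / len(rewards)
    grad = [0.0] * len(logits)
    for i, r in zip(samples, rewards):
        advantage = r - baseline
        for j in range(len(logits)):
            # d log pi(i) / d logit_j = 1[i == j] - probs[j]
            grad[j] += advantage * ((1.0 if i == j else 0.0) - probs[j])
    return [l + lr * g / n_rollouts for l, g in zip(logits, grad)]


def train(expected="rfl", steps=200, seed=0):
    """Iteratively optimize the toy policy toward the expected tactic."""
    rng = random.Random(seed)
    logits = [0.0] * len(TACTICS)
    for _ in range(steps):
        logits = reinforce_step(logits, expected, rng)
    return softmax(logits)


if __name__ == "__main__":
    final_probs = train()
    for tactic, p in zip(TACTICS, final_probs):
        print(f"{tactic}: {p:.3f}")
```

In this toy setting the probability mass shifts toward the expected tactic as the loop iterates. With an actual LLM, the logits table would be replaced by the model's token distribution conditioned on the proof state, and the comparison could be looser than exact string match.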
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Topic Modeling
