FormaRL: Enhancing Autoformalization with no Labeled Data
Yanxing Huang, Xinling Jin, Sijie Liang, Peng Li, Yang Liu

TL;DR
FormaRL introduces a reinforcement learning framework for autoformalization that requires minimal unlabeled data, significantly improving accuracy and out-of-distribution performance in formal verification tasks.
Contribution
It presents a novel reinforcement learning approach for autoformalization that leverages syntax and consistency checks, with a curated dataset to facilitate research.
Findings
Increases autoformalization accuracy by 4-6x on key datasets.
Achieves significant improvements in out-of-distribution performance.
Operates effectively with only 859 unlabeled data points.
Abstract
Autoformalization is one of the central tasks in formal verification, while its advancement remains hindered due to the data scarcity and the absence efficient methods. In this work we propose \textbf{FormaRL}, a simple yet efficient reinforcement learning framework for autoformalization which only requires a small amount of unlabeled data. FormaRL integrates syntax check from Lean compiler and consistency check from large language model to calculate the reward, and adopts GRPO algorithm to update the formalizer. We also curated a proof problem dataset from undergraduate-level math materials, named \textbf{uproof}, in the hope to facilitate the exploration of autoformalization and theorem proving in advanced math. Experiments show that FormaRL can increase the pass@1 autoformalization accuracy of Qwen2.5-Coder-7B-Instruct by 4 6x (4.04\% 26.15\% on ProofNet and 2.4\% …
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
