StepFun-Formalizer: Unlocking the Autoformalization Potential of LLMs through Knowledge-Reasoning Fusion

Yutong Wu; Di Huang; Ruosi Wan; Yue Peng; Shijie Shang; Chenrui Cao; Lei Qi; Rui Zhang; Zidong Du; Jie Yan; Xing Hu

arXiv:2508.04440·cs.CL·December 29, 2025

StepFun-Formalizer: Unlocking the Autoformalization Potential of LLMs through Knowledge-Reasoning Fusion

Yutong Wu, Di Huang, Ruosi Wan, Yue Peng, Shijie Shang, Chenrui Cao, Lei Qi, Rui Zhang, Zidong Du, Jie Yan, Xing Hu

PDF

2 Models 1 Datasets 1 Video

TL;DR

This paper introduces StepFun-Formalizer, a large language model trained with a novel pipeline that combines formal knowledge mastery and reasoning abilities, significantly improving autoformalization accuracy for mathematical statements.

Contribution

It presents ThinkingF, a data synthesis and training pipeline that enhances both formal knowledge and reasoning in LLMs for autoformalization tasks.

Findings

01

Achieved state-of-the-art BEq@1 scores of 40.5% on FormalMATH-Lite.

02

Developed datasets with formal knowledge and reasoning trajectories.

03

Models exhibit strong formal knowledge and reasoning capabilities.

Abstract

Autoformalization aims to translate natural-language mathematical statements into a formal language. While LLMs have accelerated progress in this area, existing methods still suffer from low accuracy. We identify two key abilities for effective autoformalization: comprehensive mastery of formal-language domain knowledge, and reasoning capability of natural language problem understanding and informal-formal alignment. Without the former, a model cannot identify the correct formal objects; without the latter, it struggles to interpret real-world contexts and map them precisely into formal expressions. To address these gaps, we introduce ThinkingF, a data synthesis and training pipeline that improves both abilities. First, we construct two datasets: one by distilling and selecting large-scale examples rich in formal knowledge, and another by generating informal-to-formal reasoning…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Models

Datasets

stepfun-ai/StepFun-Formalizer-Training
dataset· 165 dl
165 dl

Videos

StepFun-Formalizer: Unlocking the Autoformalization Potential of LLMs Through Knowledge-Reasoning Fusion· underline