Decompose, Structure, and Repair: A Neuro-Symbolic Framework for Autoformalization via Operator Trees

Xiaoyang Liu; Zineng Dong; Yifan Bai; Yantao Li; Yuntian Liu; Tao Luo

arXiv:2604.19000·cs.LG·April 22, 2026

Decompose, Structure, and Repair: A Neuro-Symbolic Framework for Autoformalization via Operator Trees

Xiaoyang Liu, Zineng Dong, Yifan Bai, Yantao Li, Yuntian Liu, Tao Luo

PDF

TL;DR

This paper presents DSR, a neuro-symbolic framework that decomposes mathematical statements into operator trees for improved autoformalization, achieving state-of-the-art results on a new benchmark.

Contribution

The introduction of DSR, a modular neuro-symbolic pipeline that structures and repairs formalizations using operator trees, advancing autoformalization techniques.

Findings

01

DSR outperforms existing baselines on the PRIME benchmark.

02

Decomposition into operator trees improves error localization and correction.

03

Experimental results show significant accuracy gains with DSR.

Abstract

Statement autoformalization acts as a critical bridge between human mathematics and formal mathematics by translating natural language problems into formal language. While prior works have focused on data synthesis and diverse training paradigms to optimize end-to-end Large Language Models (LLMs), they typically treat formal code as flat sequences, neglecting the hierarchical logic inherent in mathematical statements. In this work, we introduce Decompose, Structure, and Repair (DSR), a neuro-symbolic framework that restructures autoformalization into a modular pipeline. DSR decomposes statements into logical components and maps them to structured operator trees, leveraging this topological blueprint to precisely localize and repair errors via sub-tree refinement. Furthermore, we introduce PRIME, a benchmark of 156 undergraduate and graduate-level theorems selected from canonical…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.