Exploration of Neural Machine Translation in Autoformalization of Mathematics in Mizar
Qingxiang Wang, Chad Brown, Cezary Kaliszyk, Josef Urban

TL;DR
This paper investigates neural machine translation models for converting informal LaTeX-based mathematical statements into formal Mizar language, comparing supervised and unsupervised approaches and introducing a custom type-elaboration mechanism to enhance results.
Contribution
It presents a comprehensive experimental analysis of neural translation models for autoformalization and introduces a novel type-elaboration method to improve translation accuracy.
Findings
Supervised models outperform unsupervised ones in autoformalization tasks.
Data augmentation via custom type-elaboration improves translation quality.
Neural models show promise but still face challenges in formalizing complex mathematics.
Abstract
In this paper we share several experiments trying to automatically translate informal mathematics into formal mathematics. In our context informal mathematics refers to human-written mathematical sentences in the LaTeX format; and formal mathematics refers to statements in the Mizar language. We conducted our experiments against three established neural network-based machine translation models that are known to deliver competitive results on translating between natural languages. To train these models we also prepared four informal-to-formal datasets. We compare and analyze our results according to whether the model is supervised or unsupervised. In order to augment the data available for auto-formalization and improve the results, we develop a custom type-elaboration mechanism and integrate it in the supervised translation.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
