Developing Corpus-based Translation Methods between Informal and Formal Mathematics: Project Description
Cezary Kaliszyk, Josef Urban, Jiri Vyskocil, Herman Geuvers

TL;DR
This paper presents a project aimed at creating annotated corpora and developing translation methods to convert informal mathematical language into formal representations, integrating automated reasoning for improved semantic understanding.
Contribution
It introduces a comprehensive approach combining corpus development, translation methods, and automated reasoning for formalizing informal mathematics.
Findings
Initial corpora have been assembled for informal and formal mathematics.
Preliminary experiments demonstrate the feasibility of semi-automated translation.
Integration with automated reasoning shows promise for semantic accuracy.
Abstract
The goal of this project is to (i) accumulate annotated informal/formal mathematical corpora suitable for training semi-automated translation between informal and formal mathematics by statistical machine-translation methods, (ii) to develop such methods oriented at the formalization task, and in particular (iii) to combine such methods with learning-assisted automated reasoning that will serve as a strong semantic component. We describe these ideas, the initial set of corpora, and some initial experiments done over them.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsLogic, programming, and type systems · Formal Methods in Verification · Software Engineering Research
