MathBERT: A Pre-Trained Model for Mathematical Formula Understanding
Shuai Peng, Ke Yuan, Liangcai Gao, Zhi Tang

TL;DR
MathBERT is a novel pre-trained model designed specifically for understanding mathematical formulas by capturing their structural and semantic features, significantly improving performance on related NLP tasks.
Contribution
This paper introduces MathBERT, the first pre-trained model tailored for mathematical formula understanding, incorporating structural features via a new pre-training task.
Findings
MathBERT outperforms existing methods on retrieval, classification, and generation tasks.
It captures the semantic-level structural information of formulas.
It demonstrates significant improvements across multiple downstream tasks.
Abstract
Large-scale pre-trained models such as BERT have achieved great success in various Natural Language Processing (NLP) tasks, but adapting them to math-related tasks remains a challenge. Current pre-trained models neglect the structural features of formulas and the semantic correspondence between a formula and its context. To address these issues, we propose a novel pre-trained model, MathBERT, which is jointly trained on mathematical formulas and their corresponding contexts. In addition, to further capture the semantic-level structural features of formulas, a new pre-training task is designed to predict masked formula substructures extracted from the Operator Tree (OPT), the semantic structural representation of a formula. We conduct various experiments on three downstream tasks to evaluate the performance of MathBERT, including mathematical…
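To make the masked-substructure idea in the abstract more concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes, purely for illustration, that a "substructure" is an operator node together with the labels of its direct children, and that masking means replacing one such substructure with a `[MASK]` placeholder. The names `Node`, `substructures`, and `mask_one` are hypothetical.

```python
import random
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One node of a toy operator tree (OPT): an operator or an operand."""
    label: str
    children: List["Node"] = field(default_factory=list)


def substructures(root: Node) -> List[Tuple[str, Tuple[str, ...]]]:
    """Collect (operator, child labels) pairs for every internal node, in pre-order."""
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.children:
            found.append((node.label, tuple(c.label for c in node.children)))
            # Push children in reverse so the leftmost child is visited first.
            stack.extend(reversed(node.children))
    return found


def mask_one(subs, rng, mask_token="[MASK]"):
    """Hide one substructure; return the masked list and the target to predict."""
    idx = rng.randrange(len(subs))
    masked = list(subs)
    target = masked[idx]
    masked[idx] = mask_token
    return masked, target


# Toy OPT for the formula a + b * c (assumed example, not taken from the paper).
tree = Node("+", [Node("a"), Node("*", [Node("b"), Node("c")])])
subs = substructures(tree)
masked, target = mask_one(subs, random.Random(0))
```

In this toy setup, `subs` holds the two operator substructures of `a + b * c`, and a model would be trained to recover `target` from `masked` plus the surrounding context. How MathBERT actually selects and encodes substructures is not specified in the text above.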
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Computational Physics and Python Applications · Educational Assessment and Pedagogy
Methods: Multi-Head Attention · Linear Layer · WordPiece · Dense Connections · Attention Is All You Need · Residual Connection · Attention Dropout · Adam · Weight Decay
