MathBERT: A Pre-Trained Model for Mathematical Formula Understanding

Shuai Peng; Ke Yuan; Liangcai Gao; Zhi Tang

arXiv:2105.00377 · cs.CL · May 4, 2021 · 53 cites

TL;DR

MathBERT is a novel pre-trained model designed specifically for understanding mathematical formulas by capturing their structural and semantic features, significantly improving performance on related NLP tasks.

Contribution

This paper introduces MathBERT, the first pre-trained model tailored for mathematical formula understanding, incorporating structural features via a new pre-training task.

Findings

01

MathBERT outperforms existing methods on retrieval, classification, and generation tasks.

02

It effectively captures semantic structural information of formulas.

03

It demonstrates significant improvements across multiple downstream tasks.

Abstract

Large-scale pre-trained models like BERT have achieved great success in various Natural Language Processing (NLP) tasks, but adapting them to math-related tasks remains a challenge. Current pre-trained models neglect the structural features of formulas and the semantic correspondence between a formula and its context. To address these issues, we propose a novel pre-trained model, namely MathBERT, which is jointly trained with mathematical formulas and their corresponding contexts. In addition, to further capture the semantic-level structural features of formulas, a new pre-training task is designed to predict the masked formula substructures extracted from the Operator Tree (OPT), which is the semantic structural representation of formulas. We conduct various experiments on three downstream tasks to evaluate the performance of MathBERT, including mathematical…
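The masked-substructure idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Node` class, the choice of an (operator, child labels) pair as a "substructure", the `[MASK]` placeholder, and the masking rate are all assumptions made here for illustration, since the abstract does not specify them.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

MASK = "[MASK]"  # hypothetical placeholder token

Substructure = Tuple[str, Tuple[str, ...]]


@dataclass
class Node:
    """One node of a toy Operator Tree (OPT): an operator or an operand."""
    label: str
    children: List["Node"] = field(default_factory=list)


def substructures(root: Node) -> List[Substructure]:
    """Collect (operator, child labels) for every internal node, in pre-order.

    Assumption: a 'substructure' is an operator together with its direct
    children; the paper may define it differently.
    """
    out: List[Substructure] = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.children:
            out.append((node.label, tuple(c.label for c in node.children)))
            # Push children reversed so the leftmost child is visited first.
            stack.extend(reversed(node.children))
    return out


def mask_substructures(subs: List[Substructure], rate: float, rng: random.Random):
    """Hide a random subset of substructures; return (inputs, targets)."""
    k = max(1, round(len(subs) * rate))
    picked = set(rng.sample(range(len(subs)), k))
    inputs = [MASK if i in picked else s for i, s in enumerate(subs)]
    targets: Dict[int, Substructure] = {i: subs[i] for i in picked}
    return inputs, targets


# Toy OPT for the formula a^2 + b.
opt = Node("+", [Node("^", [Node("a"), Node("2")]), Node("b")])
subs = substructures(opt)
inputs, targets = mask_substructures(subs, rate=0.5, rng=random.Random(0))
```

In a real pre-training setup, a model would be trained to recover each entry of `targets` from the unmasked inputs and the surrounding context; here the functions only prepare the input/target pairs.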

Taxonomy

Topics: Mathematics, Computing, and Information Processing · Computational Physics and Python Applications · Educational Assessment and Pedagogy

Methods: Multi-Head Attention · Linear Layer · WordPiece · Dense Connections · Attention Is All You Need · Residual Connection · Attention Dropout · Adam · Weight Decay