Fine-Tuning BERTs for Definition Extraction from Mathematical Text
Lucy Horowitz, Ryan Hathaway

TL;DR
This paper fine-tunes BERT models to automatically identify sentences containing definitions of mathematical terms in LaTeX, demonstrating high accuracy with less computational effort than previous models.
Contribution
The paper applies fine-tuned BERT models to definition extraction from mathematical text, introduces two original datasets ("Chicago" and "TAC"), and also evaluates the models on the existing WFMALL dataset.
Findings
Sentence-BERT achieved the best accuracy, recall, and precision.
Fine-tuned BERT models can extract mathematical definitions efficiently.
The models performed comparably to the earlier models of Vanetik and Litvak at reduced computational cost.
Abstract
In this paper, we fine-tuned three pre-trained BERT models on the task of "definition extraction" from mathematical English written in LaTeX. This is presented as a binary classification problem: a sentence either contains a definition of a mathematical term or it does not. We used two original datasets, "Chicago" and "TAC," to fine-tune and test these models. We also tested on WFMALL, a dataset presented by Vanetik and Litvak in 2021, and compared the performance of our models to theirs. We found that a high-performance Sentence-BERT transformer model performed best based on overall accuracy, recall, and precision metrics, achieving comparable results to the earlier models with less computational effort.
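The abstract frames definition extraction as binary classification scored by accuracy, recall, and precision. As a rough illustration of how those three metrics relate to per-sentence predictions, here is a minimal sketch in plain Python. The names `BinaryMetrics` and `evaluate` and the toy labels are hypothetical and are not taken from the paper; this is not the authors' evaluation code.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float


def evaluate(gold, predicted):
    """Score binary predictions, where 1 means "sentence contains a definition".

    Both arguments are equal-length sequences of 0/1 labels.
    Precision and recall fall back to 0.0 when their denominator is zero.
    """
    if len(gold) != len(predicted) or len(gold) == 0:
        raise ValueError("gold and predicted must be non-empty and equal in length")

    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # definitions found
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false alarms
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # definitions missed
    tn = sum(1 for g, p in pairs if g == 0 and p == 0)  # correct rejections

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return BinaryMetrics(accuracy, precision, recall)


# Toy labels (made up for illustration, not data from the paper).
toy_gold = [1, 0, 1, 1, 0, 0]
toy_pred = [1, 0, 0, 1, 1, 0]
toy_metrics = evaluate(toy_gold, toy_pred)
```

On the toy labels there are two true positives, one false positive, one false negative, and two true negatives, so accuracy, precision, and recall all come out to 2/3. In practice the predicted labels would come from the fine-tuned classifier's output for each sentence.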
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Handwritten Text Recognition Techniques · Natural Language Processing Techniques
Methods: WordPiece · Residual Connection · Softmax · Layer Normalization · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dropout · Adam
