Masked Measurement Prediction: Learning to Jointly Predict Quantities and Units from Textual Context
Daniel Spokoyny, Ivan Lee, Zhao Jin, Taylor Berg-Kirkpatrick

TL;DR
This paper introduces Masked Measurement Prediction (MMP), a task in which a model jointly predicts a masked number and its unit from the surrounding text, along with a generative model, GeMM, intended to improve numeracy in language models.
Contribution
The paper proposes a novel task and a generative model for joint number and unit prediction, enhancing numerical reasoning capabilities of language models.
Findings
Traditional pretrained language models underperform on joint number-unit prediction
GeMM outperforms baselines in measurement prediction
Pretraining with MMP improves numeracy in transformer models
Abstract
Physical measurements constitute a large portion of numbers in academic papers, engineering reports, and web tables. Current benchmarks fall short of properly evaluating numeracy of pretrained language models on measurements, hindering research on developing new methods and applying them to numerical tasks. To that end, we introduce a novel task, Masked Measurement Prediction (MMP), where a model learns to reconstruct a number together with its associated unit given masked text. MMP is useful for both training new numerically informed models as well as evaluating numeracy of existing systems. In order to address this task, we introduce a new Generative Masked Measurement (GeMM) model that jointly learns to predict numbers along with their units. We perform fine-grained analyses comparing our model with various ablations and baselines. We use linear probing of traditional pretrained…
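To make the task format concrete, here is a minimal sketch of how one MMP-style example might be built from a sentence: the number and its unit are replaced with mask tokens, and the original pair becomes the prediction target. This is an illustration under stated assumptions, not the authors' data pipeline. The unit list, the `[NUM]` and `[UNIT]` mask tokens, and the names `MaskedExample` and `mask_first_measurement` are all hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical unit vocabulary, for illustration only (not from the paper).
UNITS = ("km", "kg", "cm", "m", "g", "s")

# A number (integer or decimal) followed by one of the units above.
MEASUREMENT = re.compile(
    r"(?P<number>\d+(?:\.\d+)?)\s*(?P<unit>" + "|".join(UNITS) + r")\b"
)


class MaskedExample(NamedTuple):
    masked_text: str  # input: text with the measurement hidden
    number: float     # target: the numerical value
    unit: str         # target: the associated unit


def mask_first_measurement(
    text: str, num_mask: str = "[NUM]", unit_mask: str = "[UNIT]"
) -> Optional[MaskedExample]:
    """Mask the first number-unit pair in `text`; return None if there is none."""
    match = MEASUREMENT.search(text)
    if match is None:
        return None
    masked = text[: match.start()] + f"{num_mask} {unit_mask}" + text[match.end():]
    return MaskedExample(masked, float(match.group("number")), match.group("unit"))


example = mask_first_measurement("The bridge spans 2.5 km across the bay.")
print(example)
```

A model for this task would receive `masked_text` and be asked to recover both `number` and `unit` together, rather than the number alone.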
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques
