Pre-Calc: Learning to Use the Calculator Improves Numeracy in Language Models

Vishruth Veerendranath; Vishwa Shah; Kshitish Ghate

arXiv:2404.14355·cs.CL·June 27, 2024

TL;DR

Pre-Calc introduces a pre-finetuning method enabling smaller language models to effectively learn calculator usage, significantly enhancing their numerical comprehension and reasoning capabilities across various datasets.

Contribution

It presents a novel pre-finetuning objective for encoder-only and encoder-decoder models to improve numerical reasoning by learning calculator use.

Findings

01

Improved performance on numerical reasoning tasks.

02

Effective pre-finetuning for smaller models.

03

Enhanced numerical understanding in downstream tasks.

Abstract

Quantitative and numerical comprehension in language is important in many fields, such as education and finance, but remains challenging for language models. While tool and calculator usage has been shown to improve mathematical reasoning in large pretrained decoder-only language models, it remains unexplored for smaller language models with encoders. In this paper, we propose Pre-Calc, a simple pre-finetuning objective of learning to use the calculator for both encoder-only and encoder-decoder architectures, formulated as a discriminative and a generative task respectively. We pre-train BERT and RoBERTa for discriminative calculator use and Flan-T5 for generative calculator use on the MAWPS, SVAMP, and AsDiv-A datasets, which improves performance on downstream tasks that require numerical understanding. Our code and data are available at…
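The abstract does not spell out how the two formulations look in practice, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that the discriminative task can be framed as tagging which tokens are operands plus choosing an operation, and that the generative task can be framed as emitting an arithmetic expression that an external calculator evaluates. The helper names (`calculate`, `make_discriminative_example`, `make_generative_example`) and the example word problem are hypothetical.

```python
import ast
import operator

# Assumption: the calculator supports only the four basic binary operations.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculate(expression: str):
    """Evaluate an expression built from numbers, + - * /, and parentheses.

    Walks the parsed syntax tree instead of calling eval(), so anything
    other than plain arithmetic raises ValueError.
    """
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError(f"unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval"))


def make_discriminative_example(tokens, operand_indices, operation):
    """Hypothetical encoder-only target: per-token operand tags plus one
    operation label for the whole sequence."""
    labels = [1 if i in operand_indices else 0 for i in range(len(tokens))]
    return {"tokens": tokens, "operand_labels": labels, "operation": operation}


def make_generative_example(question, expression):
    """Hypothetical encoder-decoder target: the model generates the
    expression text and the calculator produces the final answer."""
    return {"input": question, "target": expression,
            "answer": calculate(expression)}


tokens = ["Sam", "had", "8", "apples", "and", "ate", "3"]
disc = make_discriminative_example(tokens, {2, 6}, "-")
gen = make_generative_example(" ".join(tokens), "8 - 3")
```

In this framing the model never performs arithmetic itself: it only identifies the operands and operation, or writes the expression, and the calculator does the computation.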

Code & Models

Repositories

calc-cmu/pre-calc
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification

Methods: Attention Is All You Need · Weight Decay · Dense Connections · Residual Connection · Softmax · Adam · Linear Warmup With Linear Decay · Layer Normalization · Attention Dropout