Numerical reasoning in machine reading comprehension tasks: are we there yet?

Hadeel Al-Negheimish; Pranava Madhyastha; Alessandra Russo

arXiv:2109.08207·cs.CL·September 20, 2021

Open Access

TL;DR

This paper critically examines whether current NLP models truly understand numerical reasoning in machine reading comprehension, revealing that standard metrics may not accurately measure true reasoning capabilities.

Contribution

The study provides a controlled analysis of top models, highlighting limitations of existing metrics in assessing genuine numerical reasoning skills.

Findings

01. Models perform well on standard metrics but lack true reasoning ability.

02. Standard benchmarks may overestimate models' understanding of numerical reasoning.

03. Metrics do not effectively differentiate between superficial pattern matching and genuine reasoning.

Abstract

Numerical reasoning based machine reading comprehension is a task that involves reading comprehension along with using arithmetic operations such as addition, subtraction, sorting, and counting. The DROP benchmark (Dua et al., 2019) is a recent dataset that has inspired the design of NLP models aimed at solving this task. The current standings of these models in the DROP leaderboard, over standard metrics, suggest that the models have achieved near-human performance. However, does this mean that these models have learned to reason? In this paper, we present a controlled study on some of the top-performing model architectures for the task of numerical reasoning. Our observations suggest that the standard metrics are incapable of measuring progress towards such tasks.
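
To make the point about standard metrics concrete, here is a minimal sketch of the two answer-level scores commonly reported for DROP, exact match (EM) and token-level F1. It is a simplified illustration and not the official evaluation script, which includes further handling (for example of numbers and multi-span answers) that is omitted here. The function names `normalize`, `exact_match`, and `token_f1` are hypothetical and chosen only for this example.

```python
import string
from collections import Counter

ARTICLES = {"a", "an", "the"}


def normalize(text: str) -> list[str]:
    """Lowercase, trim punctuation at token edges, drop English articles.

    Assumption: a simplified normalization, not the official DROP one.
    """
    tokens = [tok.strip(string.punctuation) for tok in text.lower().split()]
    return [tok for tok in tokens if tok and tok not in ARTICLES]


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized answers are identical, otherwise 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall (bag of tokens)."""
    pred, ref = normalize(prediction), normalize(gold)
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# Both scores see only the final answer string.
print(exact_match("The 7", "7"))
print(token_f1("7 yards", "7"))
```

Both functions take only the predicted and gold answer strings as input. A model that reaches "7" by subtracting two numbers from the passage and one that reaches "7" through a surface shortcut therefore receive identical scores, which is one way to read the abstract's concern that these metrics do not measure progress on reasoning.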

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Natural Language Processing Techniques