# EQUATE: A Benchmark Evaluation Framework for Quantitative Reasoning in Natural Language Inference

**Authors:** Abhilasha Ravichander, Aakanksha Naik, Carolyn Rose, Eduard Hovy

arXiv: 1901.03735 · 2019-10-29

## TL;DR

EQUATE is a benchmark framework for evaluating quantitative reasoning in natural language inference. It shows that state-of-the-art NLI models do not, on average, improve on a majority-class baseline, and it establishes Q-REAS, a symbolic baseline.

## Contribution

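The paper introduces EQUATE, a benchmark framework for quantitative reasoning in textual entailment, and proposes Q-REAS, a baseline that manipulates quantities symbolically. Compared with the best-performing NLI model, Q-REAS does better on numerical reasoning tests (+24.2%) but worse on verbal reasoning (-8.1%).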

## Key findings

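- Across 9 published NLI models, state-of-the-art methods on average do not achieve an absolute improvement over a majority-class baseline on EQUATE, suggesting that they do not implicitly learn to reason with quantities.
- Compared with the best-performing NLI model, Q-REAS gains +24.2% on numerical reasoning tests.
- Q-REAS has limited verbal reasoning capabilities (-8.1% against the same model).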
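
The majority-class baseline in the first finding is the accuracy obtained by always predicting the most frequent gold label. The following is a minimal Python sketch of that comparison. The function names and the toy labels are hypothetical and are not taken from the paper or from EQUATE data.

```python
from collections import Counter


def majority_class_accuracy(gold_labels):
    """Accuracy of always predicting the most frequent gold label."""
    if not gold_labels:
        raise ValueError("gold_labels must not be empty")
    # most_common(1) returns [(label, count)] for the most frequent label.
    _, count = Counter(gold_labels).most_common(1)[0]
    return count / len(gold_labels)


def accuracy(gold_labels, predictions):
    """Fraction of predictions that match the gold labels."""
    if not gold_labels or len(gold_labels) != len(predictions):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(g == p for g, p in zip(gold_labels, predictions))
    return correct / len(gold_labels)


# Hypothetical toy labels, for illustration only (not EQUATE data).
toy_gold = ["entailment", "neutral", "entailment", "contradiction", "entailment"]
toy_pred = ["entailment", "entailment", "neutral", "contradiction", "neutral"]

baseline = majority_class_accuracy(toy_gold)
model_acc = accuracy(toy_gold, toy_pred)
print(f"majority baseline: {baseline:.2f}, model: {model_acc:.2f}")
```

A model is only informative on a test set when its accuracy exceeds this baseline, which is why the comparison matters for label-imbalanced evaluation sets.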

## Abstract

Quantitative reasoning is a higher-order reasoning skill that any intelligent natural language understanding system can reasonably be expected to handle. We present EQUATE (Evaluating Quantitative Understanding Aptitude in Textual Entailment), a new framework for quantitative reasoning in textual entailment. We benchmark the performance of 9 published NLI models on EQUATE, and find that on average, state-of-the-art methods do not achieve an absolute improvement over a majority-class baseline, suggesting that they do not implicitly learn to reason with quantities. We establish a new baseline Q-REAS that manipulates quantities symbolically. In comparison to the best performing NLI model, it achieves success on numerical reasoning tests (+24.2%), but has limited verbal reasoning capabilities (-8.1%). We hope our evaluation framework will support the development of models of quantitative reasoning in language understanding.
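
To give a feel for what manipulating quantities symbolically can mean, here is a deliberately tiny Python sketch. It is not the Q-REAS pipeline, whose design is described only in the full paper. The comparator phrases, the function names, and the assumption that the premise states one exact value are all hypothetical simplifications chosen for illustration.

```python
import re

# Matches integers and simple decimals such as "12" or "3.5".
NUMBER = re.compile(r"\d+(?:\.\d+)?")

# Hypothetical comparator phrases, mapped to predicates over
# (premise_value, hypothesis_value).
COMPARATORS = {
    "more than": lambda p, h: p > h,
    "less than": lambda p, h: p < h,
    "at least": lambda p, h: p >= h,
    "at most": lambda p, h: p <= h,
}


def first_number(text):
    """Return the first number in the text as a float, or None."""
    match = NUMBER.search(text)
    return float(match.group()) if match else None


def toy_quantity_label(premise, hypothesis):
    """Label a pair by comparing the first quantity on each side.

    Assumes the premise states an exact value; ignores units and entities.
    """
    p, h = first_number(premise), first_number(hypothesis)
    if p is None or h is None:
        return "neutral"  # nothing to compare symbolically
    lowered = hypothesis.lower()
    for phrase, holds in COMPARATORS.items():
        if phrase in lowered:
            return "entailment" if holds(p, h) else "contradiction"
    # No comparator phrase: treat the hypothesis as an exact-value claim.
    return "entailment" if p == h else "contradiction"


print(toy_quantity_label("The store sold 12 apples.",
                         "The store sold more than 10 apples."))
```

The sketch handles only the arithmetic comparison. Deciding whether the two quantities refer to the same thing is a verbal reasoning problem, which this toy version does not attempt.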

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.03735/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1901.03735/full.md

## References

55 references — full list in the complete paper: https://tomesphere.com/paper/1901.03735/full.md

---
Source: https://tomesphere.com/paper/1901.03735