Language Models Are Poor Learners of Directional Inference

Tianyi Li; Mohammad Javad Hosseini; Sabine Weber; Mark Steedman

arXiv:2210.04695 · cs.CL · October 17, 2022

TL;DR

This paper critically evaluates language models' ability to understand directional inferences, revealing their limitations and introducing a new multilingual benchmark to better assess this aspect of natural language understanding.

Contribution

The paper shows that current datasets are inadequate for testing directional inference and introduces BoOQA (Boolean Open QA), a robust benchmark for evaluating language models on this task.

Findings

01

Language models show limited ability to learn directional inference, even after supervised fine-tuning with prompts.

02

Existing datasets either fail to test directionality or contain artifacts that models can learn as a proxy for entailment, which inflates results.

03

BoOQA, a multilingual benchmark extrinsic to existing training sets, provides a more reliable evaluation of directional predicate entailment.

Abstract

We examine LMs' competence of directional predicate entailments by supervised fine-tuning with prompts. Our analysis shows that contrary to their apparent success on standard NLI, LMs show limited ability to learn such directional inference; moreover, existing datasets fail to test directionality, and/or are infested by artefacts that can be learnt as proxy for entailments, yielding over-optimistic results. In response, we present BoOQA (Boolean Open QA), a robust multi-lingual evaluation benchmark for directional predicate entailments, extrinsic to existing training sets. On BoOQA, we establish baselines and show evidence of existing LM-prompting models being incompetent directional entailment learners, in contrast to entailment graphs, however limited by sparsity.

Code & Models

Repositories

teddy-li/lm-dirctionalinference
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Test