Curriculum: A Broad-Coverage Benchmark for Linguistic Phenomena in Natural Language Understanding

Zeming Chen; Qiyue Gao

arXiv:2204.06283 · cs.CL · May 5, 2022

TL;DR

This paper introduces Curriculum, a comprehensive benchmark for evaluating language models on 36 linguistic phenomena, aiming to diagnose model capabilities and limitations in natural language understanding.

Contribution

It presents a new broad-coverage NLI benchmark with diverse datasets and an evaluation procedure to assess linguistic reasoning skills in language models.

Findings

01 Curriculum effectively diagnoses model strengths and weaknesses.

02 Existing benchmarks have limitations in covering diverse linguistic phenomena.

03 The benchmark reveals gaps in current models' understanding.

Abstract

In the age of large transformer language models, linguistic evaluation plays an important role in diagnosing models' abilities and limitations on natural language understanding. However, current evaluation methods show some significant shortcomings. In particular, they do not provide insight into how well a language model captures distinct linguistic skills essential for language understanding and reasoning. Thus, they fail to effectively map out the aspects of language understanding that remain challenging to existing models, which makes it hard to discover potential limitations in models and datasets. In this paper, we introduce Curriculum as a new format of NLI benchmark for evaluation of broad-coverage linguistic phenomena. Curriculum contains a collection of datasets that covers 36 types of major linguistic phenomena and an evaluation procedure for diagnosing how well a language…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research