WinoDict: Probing language models for in-context word acquisition

Julian Martin Eisenschlos, Jeremy R. Cole, Fangyu Liu, and William W. Cohen

arXiv:2209.12153 · cs.CL · September 27, 2022

TL;DR

This paper introduces WinoDict, a benchmark to evaluate large language models' ability to learn and understand new words in context, highlighting current limitations in their in-context learning capabilities.

Contribution

The paper presents a novel benchmark that tests LLMs' ability to acquire new words during inference using synthetic words and definitions, addressing a key aspect of language change.

Findings

01. LLM accuracy drops sharply on WinoDict compared with the original Winograd tasks.

02. Current models struggle to learn a new word from a definition supplied in the prompt.

03. The benchmark exposes limitations in how models understand novel words.

Abstract

We introduce a new in-context learning paradigm to measure Large Language Models' (LLMs) ability to learn novel words during inference. In particular, we rewrite Winograd-style co-reference resolution problems by replacing the key concept word with a synthetic but plausible word that the model must understand to complete the task. Solving this task requires the model to make use of the dictionary definition of the new word given in the prompt. This benchmark addresses word acquisition, one important aspect of the diachronic degradation known to afflict LLMs. As LLMs are frozen in time at the moment they are trained, they are normally unable to reflect the way language changes over time. We show that the accuracy of LLMs compared to the original Winograd tasks decreases radically in our benchmark, thus identifying a limitation of current models and providing a benchmark to measure future…
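The rewriting step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `WinogradExample` fields, the prompt template, the synthetic word "plest", its definition, and the `score` callable are all hypothetical stand-ins chosen for this example.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class WinogradExample:
    """Hypothetical container for one Winograd-style problem."""
    sentence: str             # sentence with "_" marking the ambiguous referent
    options: Tuple[str, str]  # the two candidate referents
    answer: int               # index of the correct option
    key_word: str             # concept word that decides the answer


def build_candidates(ex: WinogradExample, new_word: str, definition: str) -> List[str]:
    """Swap the key word for a synthetic word and prepend its definition.

    Returns one full prompt per candidate referent. The template is an
    assumption made for illustration, not the format used in the paper.
    """
    pattern = r"\b" + re.escape(ex.key_word) + r"\b"
    rewritten, count = re.subn(pattern, new_word, ex.sentence)
    if count == 0:
        raise ValueError(f"key word {ex.key_word!r} not found in sentence")
    header = f'The word "{new_word}" means {definition}.'
    return [f"{header}\n{rewritten.replace('_', option)}" for option in ex.options]


def pick_answer(prompts: List[str], score: Callable[[str], float]) -> int:
    """Return the index of the prompt the scorer prefers.

    `score` stands in for a language model's likelihood of the text.
    """
    return max(range(len(prompts)), key=lambda i: score(prompts[i]))


example = WinogradExample(
    sentence="The trophy doesn't fit in the suitcase because _ is too big.",
    options=("the trophy", "the suitcase"),
    answer=0,
    key_word="big",
)
prompts = build_candidates(example, new_word="plest", definition="of large size")
for prompt in prompts:
    print(prompt, end="\n\n")
```

Because the original key word no longer appears anywhere in the prompt, a scorer can only separate the two candidates by using the definition line. Running the same `pick_answer` loop over original and rewritten examples would give the accuracy comparison the abstract refers to.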

Code & Models

Datasets

tasksource/winodict (dataset, 18 downloads)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning