Constructions Are So Difficult That Even Large Language Models Get Them Right for the Wrong Reasons

Shijia Zhou; Leonie Weissweiler; Taiqi He; Hinrich Schütze; David R. Mortensen; Lori Levin

arXiv:2403.17760·cs.CL·May 31, 2024·1 cites

TL;DR

This paper shows that large language models can get complex linguistic constructions right for the wrong reasons: GPT-4 and Llama 2 fail a high-lexical-overlap NLI challenge with strong bias, exposing limits in how they represent the meaning of these constructions.

Contribution

It introduces a small NLI challenge dataset with large lexical overlap and analyzes LLM failures through further sub-tasks, linking them to a group of constructions whose three adjective classes cannot be distinguished by surface features.

Findings

01

GPT-4 and Llama 2 fail the new NLI challenge with strong bias.

02

Models struggle to distinguish constructions with similar surface features.

03

LLMs do not adequately capture the meaning of certain linguistic constructions.

Abstract

In this paper, we make a contribution that can be understood from two perspectives: from an NLP perspective, we introduce a small challenge dataset for NLI with large lexical overlap, which minimises the possibility of models discerning entailment solely based on token distinctions, and show that GPT-4 and Llama 2 fail it with strong bias. We then create further challenging sub-tasks in an effort to explain this failure. From a Computational Linguistics perspective, we identify a group of constructions with three classes of adjectives which cannot be distinguished by surface features. This enables us to probe for LLMs' understanding of these constructions in various ways, and we find that they fail in a variety of ways to distinguish between them, suggesting that they don't adequately represent their meaning or capture the lexical properties of phrasal heads.
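
To make the idea of "large lexical overlap" concrete, the following is a minimal sketch of one way to measure it for a premise/hypothesis pair, using Jaccard overlap of token sets. The tokenizer, the function names, and the example pair are all assumptions made for illustration; the abstract does not say how overlap was measured, and the sentences are not taken from the paper's dataset.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word tokens; a deliberately crude tokenizer (assumption)."""
    return set(re.findall(r"[a-z']+", text.lower()))


def lexical_overlap(premise: str, hypothesis: str) -> float:
    """Jaccard overlap between the token sets of the two sentences."""
    p, h = tokens(premise), tokens(hypothesis)
    if not p and not h:
        return 0.0  # avoid division by zero on two empty inputs
    return len(p & h) / len(p | h)


# Hypothetical pair, invented for illustration (not from the paper's dataset).
premise = "I was so happy that I cried."
hypothesis = "I cried because I was happy."
score = lexical_overlap(premise, hypothesis)
print(f"lexical overlap: {score:.3f}")
```

When most pairs in a dataset score high on a measure like this, a model cannot separate entailment from non-entailment by spotting which tokens differ, which is the property the abstract attributes to the challenge set.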

Code & Models

Repositories

shijiazh/constructions-are-so-difficult
none · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Multi-Head Attention · Softmax · Dropout