Black Big Boxes: Tracing Adjective Order Preferences in Large Language Models

Jaap Jumelet; Lisa Bylinina; Willem Zuidema; Jakub Szymanik

arXiv:2407.02136·cs.CL·February 10, 2026

TL;DR

This paper investigates adjective order preferences in large language models. Their behavior is largely explained by training data frequencies, but the models also generalize to unseen adjective combinations and respond to contextual cues.

Contribution

The study demonstrates that LMs' adjective order preferences are primarily based on distributional data, yet they also generalize beyond memorized patterns and utilize contextual cues.

Findings

01 LM predictions align with training data frequencies

02 Models generalize to unseen adjective combinations

03 Contextual cues influence adjective order in output

Abstract

In English and other languages, multiple adjectives in noun phrases follow intricate ordering patterns. These patterns have been widely studied in linguistics and provide a useful test case for assessing how language models (LMs) acquire graded and context-sensitive word order preferences. We ask to what extent adjective order preferences in LMs can be explained by distributional learning alone, and where models exhibit behaviour that goes beyond surface co-occurrence patterns. We find that LM predictions are largely explained by training data frequencies: simple n-gram statistics account for much of their behaviour and closely mirror the preferences learned during training. However, by analysing learning dynamics we reveal that models also generalize robustly to unseen adjective combinations, indicating that their behaviour cannot be reduced to memorization of observed orders alone.…
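The abstract's point that simple n-gram statistics account for much of LM behaviour can be illustrated with a minimal sketch. It is not the authors' method or code: the toy corpus, the function names `bigram_counts` and `order_preference`, and the add-one smoothing are all assumptions made for illustration. The sketch scores an adjective pair by the log ratio of how often each order appears as an adjacent bigram.

```python
import math
from collections import Counter


def bigram_counts(sentences):
    """Count adjacent token pairs in whitespace-tokenized, lowercased text."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return counts


def order_preference(counts, adj_a, adj_b, smoothing=1.0):
    """Log ratio of smoothed counts for 'adj_a adj_b' vs 'adj_b adj_a'.

    Positive values favour the order adj_a -> adj_b. The additive
    smoothing (an assumption of this sketch) keeps unseen pairs at 0.
    """
    ab = counts[(adj_a, adj_b)] + smoothing
    ba = counts[(adj_b, adj_a)] + smoothing
    return math.log(ab / ba)


# Hypothetical toy corpus; a real analysis would use the LM's training data.
toy_corpus = [
    "the big black box",
    "a big black dog",
    "the black big box",
    "a small red car",
]

counts = bigram_counts(toy_corpus)
score = order_preference(counts, "big", "black")
print(f"preference for 'big black' over 'black big': {score:.3f}")
```

A pair that never occurs in either order scores exactly zero under this baseline, which is why generalization to unseen adjective combinations cannot be explained by such counts alone.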

Code & Models

Repositories

jumelet/lm-adjorder
Official

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Sparse Evolutionary Training