Language models fail at extended rule following

Tianxiang Dai; Jonathan Fan

arXiv:2605.02028·cs.CL·May 19, 2026

TL;DR

Large language models cannot reliably follow extended rules such as counting, because they track progress with a finite number of internal states. This points to a need for new architectures for autonomous agents.

Contribution

This paper shows that current language models cannot sustain exact rule following over long inputs and identifies exhaustion of internal states as the core failure mode.

Findings

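01 Models cannot count accurately beyond a model-dependent, syntax-sensitive capacity threshold.

02 Failures are abrupt and persist despite increased model size, inference-time computation, and external tools.

03 The internal states that models use for counting are finite, and counting fails once they are exhausted.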
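
Finding 03 can be illustrated with a toy model. The sketch below is not the paper's probing method and makes no claim about how any real model is wired; it only assumes a counter with a fixed number of states that cannot advance past its last one. The function name `saturating_count` and the parameter `num_states` are hypothetical.

```python
def saturating_count(text: str, char: str, num_states: int) -> int:
    """Count occurrences of `char` using only `num_states` distinct states.

    Toy assumption: the states are 0 .. num_states - 1, and once the last
    state is reached the counter can no longer advance.
    """
    state = 0
    for c in text:
        if c == char and state < num_states - 1:
            state += 1
    return state


# Below capacity the toy counter is exact.
short_result = saturating_count("a" * 5, "a", num_states=10)

# Above capacity it stays stuck at its last state, however long the input is.
long_result = saturating_count("a" * 50, "a", num_states=10)
```

In this toy, the answer is exact up to `num_states - 1` and wrong for every longer input, which mirrors the abrupt failure described in the findings.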

Abstract

Large language models are highly capable of answering difficult questions by retrieving, recombining, and attending to information in long contexts. Agentic tasks require an additional capability: preserving an exact state while repeatedly applying rules. We find that this reliability is absent across language models. To demonstrate, we ask 126 leading model variants to count a long string of repeated characters, and we find that none can count accurately above a model-dependent, syntax-sensitive counting capacity threshold. Failures are abrupt and persist even with increasing model size, inference-time computation, and external tools. Mechanistic probing indicates that models use a finite number of internal states to mimic counting as a rule and fail once these states are exhausted. Furthermore, such states are the basis for performing complex…
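
The counting task described in the abstract can be sketched as a small evaluation loop. This is a minimal sketch under stated assumptions, not the authors' harness: the prompt wording, the helper names (`make_prompt`, `parse_count`, `counting_capacity`), and the `ask` callable that stands in for a model query are all hypothetical, and the stub `fake_model` is a stand-in rather than a real model.

```python
import re
from typing import Callable, Iterable, Optional


def make_prompt(n: int, char: str = "a") -> str:
    """Build a counting prompt whose last line is `char` repeated n times (assumed wording)."""
    question = (
        f"How many times does the character '{char}' appear in the string below? "
        "Answer with a number only."
    )
    return question + "\n" + char * n


def parse_count(reply: str) -> Optional[int]:
    """Extract the first integer from a model reply, or None if there is none."""
    match = re.search(r"\d+", reply)
    return int(match.group()) if match else None


def counting_capacity(ask: Callable[[str], str], lengths: Iterable[int], char: str = "a") -> int:
    """Return the largest tested length answered correctly before the first failure."""
    capacity = 0
    for n in sorted(lengths):
        if parse_count(ask(make_prompt(n, char))) != n:
            break
        capacity = n
    return capacity


def fake_model(prompt: str) -> str:
    """Stub standing in for a real model: counts correctly up to 30, then saturates."""
    n = len(prompt.splitlines()[-1])
    return str(min(n, 30))


capacity = counting_capacity(fake_model, range(1, 101))
```

With the stub, `capacity` is 30. To test a real model, `fake_model` would be replaced with a function that sends the prompt to that model and returns its text reply. Because the abstract describes the threshold as syntax-sensitive, varying `char` or the prompt wording would be a natural extension.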
