Do LLMs Truly Understand When a Precedent Is Overruled?

Li Zhang; Jaromir Savelka; Kevin Ashley

arXiv:2510.20941·cs.CL·January 21, 2026


TL;DR

This paper evaluates how well large language models identify overruling relationships between legal cases. It reports three limitations (era sensitivity, shallow reasoning, and context-dependent failures) and introduces a new benchmark for realistic legal reasoning assessment.

Contribution

The paper presents a long-context legal reasoning benchmark focused on overruling relationships and uses it to highlight key limitations of current LLMs in complex legal understanding.

Findings

01 Models perform worse on historical cases, indicating temporal bias.

02 Models rely on shallow heuristics rather than deep legal reasoning.

03 Models fail in complex, context-dependent legal reasoning tasks.
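One way to make the first finding concrete is to bucket predictions by the decision year of the case and compare accuracy across buckets. The sketch below is only an illustration under stated assumptions: the record fields (`year`, `gold`, `predicted`), the 50-year bucket width, and the toy records are all hypothetical and are not taken from the paper or its dataset.

```python
from collections import defaultdict


def accuracy_by_era(records, bucket_years=50):
    """Return accuracy per era bucket, keyed by the bucket's starting year.

    Each record is a dict with hypothetical fields:
      year      - decision year of the case being classified
      gold      - True if the case pair is a genuine overruling relationship
      predicted - the model's yes/no answer for the same pair
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for rec in records:
        # Map the year to the start of its bucket, e.g. 1896 -> 1850.
        start = (rec["year"] // bucket_years) * bucket_years
        totals[start] += 1
        if rec["predicted"] == rec["gold"]:
            correct[start] += 1
    return {start: correct[start] / totals[start] for start in sorted(totals)}


# Made-up toy records for illustration only; NOT data from the paper.
toy = [
    {"year": 1842, "gold": True, "predicted": False},
    {"year": 1896, "gold": True, "predicted": True},
    {"year": 1954, "gold": True, "predicted": True},
    {"year": 1992, "gold": False, "predicted": False},
]

result = accuracy_by_era(toy)
print(result)
```

A consistent gap between older and newer buckets on real evaluation output would be the kind of era sensitivity the finding describes. A real analysis would also need enough cases per bucket for the comparison to be meaningful.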

Abstract

Large language models (LLMs) with extended context windows show promise for complex legal reasoning tasks, yet their ability to understand long legal documents remains insufficiently evaluated. Developing long-context benchmarks that capture realistic, high-stakes tasks remains a significant challenge in the field, as most existing evaluations rely on simplified synthetic tasks that fail to represent the complexity of real-world document understanding. Overruling relationships are foundational to common-law doctrine and commonly found in judicial opinions. They provide a focused and important testbed for long-document legal understanding that closely resembles what legal professionals actually do. We present an assessment of state-of-the-art LLMs on identifying overruling relationships from U.S. Supreme Court cases using a dataset of 236 case pairs. Our evaluation reveals three critical…
