Where Experts Disagree, Models Fail: Detecting Implicit Legal Citations in French Court Decisions

Avrile Floro; Tamara Dhorasoo (UPHF); Soline Pellez (UPHF); Nils Holzenberger

arXiv:2603.22973 · cs.AI · March 25, 2026

Open Access

TL;DR

This paper studies how well computational models detect implicit citations of the French Civil Code in first-instance court decisions. It shows that expert disagreement predicts model failures and reframes the task as top-k ranking to improve precision.

Contribution

It introduces a benchmark of 1,015 passage-article pairs, annotated by three legal experts, for implicit legal citation detection, and demonstrates that ensemble and ranking methods can enhance model performance despite expert disagreement.

Findings

01

Expert disagreement correlates with model failures.

02

The supervised ensemble achieves F1 = 0.70 (77% accuracy).

03

Unsupervised ranking improves precision to 76% at top 200.
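
The "precision at top 200" figure above is a precision-at-k measurement: rank all passage-article pairs by a model score and count how many of the first k are true citations. The sketch below illustrates that metric only. The function name, the score values, and the labels are hypothetical toy inputs, not the paper's ranking method or data.

```python
from typing import Sequence


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of positive labels among the k highest-scored items."""
    if k <= 0 or len(scores) != len(labels) or not scores:
        raise ValueError("need k > 0 and equally sized, non-empty inputs")
    # Rank pairs by descending score (any similarity or classifier score would do).
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    # If k exceeds the number of items, the whole list is used.
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)


# Toy data, not from the paper: 1 = implicit citation, 0 = no citation.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [1, 0, 1, 1, 0]
print(precision_at_k(toy_scores, toy_labels, k=4))  # 3 of the top 4 are positive: 0.75
```

With k = 200, the same computation would give the share of true implicit citations among the 200 highest-ranked pairs.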

Abstract

Computational methods applied to legal scholarship hold the promise of analyzing law at scale. We start from a simple question: how often do courts implicitly apply statutory rules? This requires distinguishing legal reasoning from semantic similarity. We focus on implicit citation of the French Civil Code in first-instance court decisions and introduce a benchmark of 1,015 passage-article pairs annotated by three legal experts. We show that expert disagreement predicts model failures. Inter-annotator agreement is moderate ($\kappa$ = 0.33) with 43% of disagreements involving the boundary between factual description and legal reasoning. Our supervised ensemble achieves F1 = 0.70 (77% accuracy), but this figure conceals an asymmetry: 68% of false positives fall on the 33% of cases where the annotators disagreed. Despite these limits, reframing the task as top-k ranking and leveraging…
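
The abstract reports a $\kappa$ agreement score for three annotators without saying which variant is used. As an illustration only, the sketch below assumes Fleiss' kappa, a common choice when more than two raters label each item. The function name and the toy ratings are hypothetical and are not drawn from the benchmark.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa from per-item category counts.

    counts[i][j] is the number of raters who put item i in category j;
    every row must sum to the same number of raters n (n >= 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_items == 0 or n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_cats = len(counts[0])
    # Observed agreement: share of agreeing rater pairs per item, averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items
    # Expected agreement from the overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters) for j in range(n_cats)
    ]
    p_expected = sum(p * p for p in proportions)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy ratings, not from the paper: three raters, categories [citation, no citation].
toy_counts = [[3, 0], [0, 3], [2, 1]]
print(round(fleiss_kappa(toy_counts), 2))  # 0.55
```

Values near 1 indicate near-perfect agreement and values near 0 indicate agreement no better than chance, which is the scale on which the reported 0.33 should be read.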

Taxonomy

Topics: Artificial Intelligence in Law · Multi-Agent Systems and Negotiation · Topic Modeling