Where Experts Disagree, Models Fail: Detecting Implicit Legal Citations in French Court Decisions
Avrile Floro, Tamara Dhorasoo (UPHF), Soline Pellez (UPHF), Nils Holzenberger

TL;DR
This paper investigates how computational models detect implicit legal citations in French court decisions, highlighting the challenges posed by expert disagreement and proposing ranking-based solutions to improve detection accuracy.
Contribution
It introduces a benchmark dataset for implicit legal citation detection and demonstrates that ensemble and ranking methods can enhance model performance despite expert disagreement.
Findings
Expert disagreement correlates with model failures.
The supervised ensemble achieves an F1 score of 0.70 (77% accuracy).
Unsupervised ranking raises precision to 76% in the top 200.
Abstract
Computational methods applied to legal scholarship hold the promise of analyzing law at scale. We start from a simple question: how often do courts implicitly apply statutory rules? This requires distinguishing legal reasoning from semantic similarity. We focus on implicit citation of the French Civil Code in first-instance court decisions and introduce a benchmark of 1,015 passage-article pairs annotated by three legal experts. We show that expert disagreement predicts model failures. Inter-annotator agreement is moderate (agreement coefficient = 0.33), with 43% of disagreements involving the boundary between factual description and legal reasoning. Our supervised ensemble achieves F1 = 0.70 (77% accuracy), but this figure conceals an asymmetry: 68% of false positives fall on the 33% of cases where the annotators disagreed. Despite these limits, reframing the task as top-k ranking and leveraging…
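The two quantities the abstract leans on, precision in the top k and the concentration of false positives on disagreement cases, can be illustrated with a minimal Python sketch. The function names `precision_at_k` and `over_representation` and the toy scores and labels are hypothetical and chosen for illustration; this is not the authors' implementation, and only the 68% and 33% figures come from the abstract.

```python
from typing import Sequence


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of positive labels among the k highest-scoring items.

    `scores` are model scores (higher means more likely an implicit citation),
    `labels` are 0/1 gold annotations. Both are illustrative inputs.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equally long")
    # Sort pairs by score, highest first, and keep the top k.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)


def over_representation(share_of_errors: float, share_of_cases: float) -> float:
    """Ratio of the share of errors in a subset to that subset's share of all cases."""
    return share_of_errors / share_of_cases


# Toy data, invented for this example only.
scores = [0.91, 0.15, 0.78, 0.40, 0.66]
labels = [1, 0, 1, 0, 0]
print(precision_at_k(scores, labels, k=3))

# Figures from the abstract: 68% of false positives fall on 33% of cases.
print(round(over_representation(0.68, 0.33), 2))
```

Under these figures, false positives are roughly twice as frequent on disagreement cases as their share of the data would suggest, which is the asymmetry that the aggregate F1 score hides.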
Taxonomy
Topics: Artificial Intelligence in Law · Multi-Agent Systems and Negotiation · Topic Modeling
