ECtHR-PCR: A Dataset for Precedent Understanding and Prior Case Retrieval in the European Court of Human Rights
T.Y.S.S Santosh, Rashid Gustav Haddad, Matthias Grabmair

TL;DR
This paper introduces ECtHR-PCR, a large and realistic dataset for precedent case retrieval in the European Court of Human Rights, and evaluates various retrieval models and strategies to improve legal case understanding.
Contribution
It provides a novel dataset that separates facts from arguments and simulates real-world legal retrieval, along with benchmarking different retrieval approaches and analyzing their limitations.
Findings
Difficulty-based negative sampling was ineffective.
Dense retrieval models' performance degrades over time.
Different legal views influence retrieval outcomes.
Abstract
In common law jurisdictions, legal practitioners rely on precedents to construct arguments, in line with the doctrine of stare decisis. As the number of cases grows over the years, prior case retrieval (PCR) has garnered significant attention. Besides lacking real-world scale, existing PCR datasets do not simulate a realistic setting, because their queries use complete case documents while only masking references to prior cases. The query is thereby exposed to legal reasoning not yet available when constructing an argument for an undecided case as well as spurious patterns left behind by citation masks, potentially short-circuiting a comprehensive understanding of case facts and legal principles. To address these limitations, we introduce a PCR dataset based on judgements from the European Court of Human Rights (ECtHR), which explicitly separate facts from arguments and exhibit…
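The contrast the abstract draws between the two query styles can be illustrated with a minimal sketch. It is not the authors' code: the `Case` structure, the `[CITE:...]` marker format, the `[MASK]` token, and both function names are assumptions made purely for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical container for one judgement, split into two parts."""
    case_id: str
    facts: str       # factual background, known before the case is decided
    reasoning: str   # the court's legal reasoning, written at decision time


# Assumed citation marker format; real judgements cite prior cases differently.
CITATION = re.compile(r"\[CITE:[^\]]*\]")


def masked_full_document_query(case: Case) -> str:
    """Query style the abstract criticizes: the whole document, citations masked.

    The reasoning text and the mask tokens both remain visible to the retriever.
    """
    return CITATION.sub("[MASK]", case.facts + " " + case.reasoning)


def facts_only_query(case: Case) -> str:
    """Query restricted to the facts, as for a case that is still undecided."""
    return case.facts


# Placeholder example text, not taken from any real judgement.
example = Case(
    case_id="q1",
    facts="The applicant was detained without judicial review.",
    reasoning="The Court recalls [CITE:A v. B] on detention.",
)
print(masked_full_document_query(example))
print(facts_only_query(example))
```

In the first query the retriever still sees the court's reasoning and the positions of the masks; in the second it sees only what would be known before a decision, which is the setting the abstract argues for.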
Taxonomy
Topics: European and International Law Studies
