Legal Experts Disagree With Rationale Extraction Techniques for Explaining ECtHR Case Outcome Classification

Mahammad Namazov; Tomáš Koref; Ivan Habernal

arXiv:2601.12419 · cs.CL · April 8, 2026



TL;DR

This paper evaluates the effectiveness of rationale extraction techniques for explaining legal outcome predictions in ECtHR cases, revealing discrepancies between model explanations and legal expert judgments.

Contribution

The paper introduces a new ECtHR dataset of curated violation and non-violation cases, compares model-agnostic interpretability methods, and highlights where model rationales diverge from expert reasoning.

Findings

01. Models' reasons differ from legal experts' judgments.

02. Existing explanation techniques may lack plausibility in legal contexts.

03. The source code for experiments is publicly available.

Abstract

Interpretability is critical for applications of large language models (LLMs) in the legal domain, where trust and transparency are essential. A central NLP task in this setting is legal outcome prediction, where models forecast whether a court will find a violation of a given right. We study this task on decisions from the European Court of Human Rights (ECtHR), introducing a new ECtHR dataset with carefully curated positive (violation) and negative (non-violation) cases. Existing works propose both task-specific approaches and model-agnostic techniques to explain downstream performance, but it remains unclear which techniques best explain legal outcome prediction. To address this, we propose a comparative analysis framework for model-agnostic interpretability methods. We focus on two rationale extraction techniques that justify model outputs with concise, human-interpretable text…


Code & Models

Repositories

trusthlt/IntEval (GitHub)
