Swiss-Judgment-Prediction: A Multilingual Legal Judgment Prediction Benchmark
Joel Niklaus, Ilias Chalkidis, Matthias Stürmer

TL;DR
This paper introduces a multilingual, diachronic legal judgment prediction benchmark from Swiss courts, evaluating BERT-based models and analyzing factors affecting prediction accuracy to aid legal AI development.
Contribution
It releases a large, multilingual Swiss legal judgment dataset and benchmarks BERT-based models, including variants overcoming input length limitations, for legal judgment prediction.
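To make the input length limitation concrete, here is a minimal sketch of the segmenting step that a hierarchical approach would need before encoding a long case. This summary does not describe the authors' implementation, so everything below is an assumption for illustration: the function name split_into_segments, the cap of 4 segments, and the padding id 0 are hypothetical. Only the 512-token figure comes from the abstract.

```python
from typing import List


def split_into_segments(
    token_ids: List[int],
    segment_len: int = 512,   # BERT's input limit, as stated in the abstract
    max_segments: int = 4,    # hypothetical cap, not taken from the paper
    pad_id: int = 0,          # hypothetical padding token id
) -> List[List[int]]:
    """Cut a long token sequence into fixed-size, padded segments.

    Each segment is short enough to be encoded on its own; a hierarchical
    model would then combine the per-segment encodings into one document
    representation (that second stage is not shown here).
    """
    segments: List[List[int]] = []
    for start in range(0, len(token_ids), segment_len):
        if len(segments) == max_segments:
            break  # tokens beyond the cap are dropped
        chunk = token_ids[start:start + segment_len]
        # Right-pad the last (shorter) chunk so all segments share one length.
        chunk = chunk + [pad_id] * (segment_len - len(chunk))
        segments.append(chunk)
    return segments


# A made-up document of 1,000 token ids splits into two 512-token segments.
example_segments = split_into_segments(list(range(1, 1001)))
print(len(example_segments), len(example_segments[0]))
```

Padding every segment to the same length keeps the segments stackable into a single batch, which is the usual reason for this design; a real pipeline would also add the special tokens its tokenizer expects.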
Findings
Hierarchical BERT performs best, reaching approximately 68-70% Macro-F1 in German and French.
Performance varies with canton of origin, year of publication, text length, and legal area.
The dataset and code are publicly available for future research.
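For readers unfamiliar with the metric reported above, the following is a minimal sketch of how Macro-F1 is computed: F1 is calculated per class and then averaged with equal weight, so a rare class counts as much as a frequent one. The function name macro_f1, the label names, and the toy predictions are hypothetical and do not come from the paper or its dataset.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up data: an imbalanced label set and a classifier that always
# predicts the majority class.
toy_true = ["majority"] * 8 + ["minority"] * 2
toy_pred = ["majority"] * 10
print(round(macro_f1(toy_true, toy_pred), 3))
```

In this toy case accuracy is 80%, but Macro-F1 is only about 0.44 because the minority class receives an F1 of zero. That sensitivity to the smaller class is why Macro-F1 is a common choice when outcomes are imbalanced.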
Abstract
In many jurisdictions, the excessive workload of courts leads to high delays. Suitable predictive AI models can assist legal professionals in their work, and thus enhance and speed up the process. So far, Legal Judgment Prediction (LJP) datasets have been released in English, French, and Chinese. We publicly release a multilingual (German, French, and Italian), diachronic (2000-2020) corpus of 85K cases from the Federal Supreme Court of Switzerland (FSCS). We evaluate state-of-the-art BERT-based methods including two variants of BERT that overcome the BERT input (text) length limitation (up to 512 tokens). Hierarchical BERT has the best performance (approx. 68-70% Macro-F1-Score in German and French). Furthermore, we study how several factors (canton of origin, year of publication, text length, legal area) affect performance. We release both the benchmark dataset and our code to…
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · WordPiece · Adam · Attention Dropout · Residual Connection · Weight Decay · Dropout · Dense Connections
