The Right Model for the Job: An Evaluation of Legal Multi-Label Classification Baselines
Martina Forster, Claudia Schulz, Prudhvi Nokku, Melicaalsadat Mirsafian, Jaykumar Kasundra, Stavroula Skylaki

TL;DR
This paper evaluates multi-label classification methods on two public legal datasets, POSTURE50K and EURLEX57K. DistilRoBERTa and LegalBERT perform consistently well, T5 is comparable, and the CrossEncoder shows potential for macro-F1 gains at higher computational cost.
Contribution
The paper compares traditional ML and Transformer-based MLC methods in the legal domain, varying the amount of training data and the number of labels to analyze how performance relates to dataset properties.
Findings
DistilRoBERTa and LegalBERT perform consistently well.
T5 offers comparable performance with generative advantages.
CrossEncoder can improve macro-F1 scores but at higher computational costs.
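Because the findings single out macro-F1, it may help to see how it differs from micro-F1 in a multi-label setting. The following is a minimal sketch in plain Python, not the paper's evaluation code; the function names and the toy label names (`appeal`, `dismissal`, `remand`) are assumptions made up for illustration. Macro-F1 averages per-label F1 scores so that rare labels count as much as frequent ones, while micro-F1 pools all decisions before computing a single score.

```python
from typing import List, Set, Tuple


def label_counts(y_true: List[Set[str]], y_pred: List[Set[str]], label: str) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for one label."""
    tp = fp = fn = 0
    for gold, pred in zip(y_true, y_pred):
        if label in gold and label in pred:
            tp += 1
        elif label in pred:
            fp += 1
        elif label in gold:
            fn += 1
    return tp, fp, fn


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); defined here as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: List[Set[str]], y_pred: List[Set[str]], labels: List[str]) -> float:
    """Unweighted mean of per-label F1: every label counts equally."""
    scores = [f1_from_counts(*label_counts(y_true, y_pred, lab)) for lab in labels]
    return sum(scores) / len(scores)


def micro_f1(y_true: List[Set[str]], y_pred: List[Set[str]], labels: List[str]) -> float:
    """F1 over counts pooled across labels: frequent labels dominate."""
    tp = fp = fn = 0
    for lab in labels:
        t, p, n = label_counts(y_true, y_pred, lab)
        tp, fp, fn = tp + t, fp + p, fn + n
    return f1_from_counts(tp, fp, fn)


# Hypothetical toy data: one frequent label predicted perfectly, two rare labels missed.
labels = ["appeal", "dismissal", "remand"]
y_true = [{"appeal"}, {"appeal"}, {"appeal", "dismissal"}, {"remand"}]
y_pred = [{"appeal"}, {"appeal"}, {"appeal"}, set()]

print(f"macro-F1: {macro_f1(y_true, y_pred, labels):.3f}")
print(f"micro-F1: {micro_f1(y_true, y_pred, labels):.3f}")
```

In this toy setup micro-F1 stays high because the frequent label is handled well, whereas macro-F1 drops sharply because two of the three labels are never predicted. A method that improves macro-F1 is therefore one that does better on infrequent labels.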
Abstract
Multi-Label Classification (MLC) is a common task in the legal domain, where more than one label may be assigned to a legal document. A wide range of methods can be applied, ranging from traditional ML approaches to the latest Transformer-based architectures. In this work, we perform an evaluation of different MLC methods using two public legal datasets, POSTURE50K and EURLEX57K. By varying the amount of training data and the number of labels, we explore the comparative advantage offered by different approaches in relation to the dataset properties. Our findings highlight DistilRoBERTa and LegalBERT as performing consistently well in legal MLC with reasonable computational demands. T5 also demonstrates comparable performance while offering advantages as a generative model in the presence of changing label sets. Finally, we show that the CrossEncoder exhibits potential for notable…
Taxonomy
Topics: Text and Document Classification Technologies · Imbalanced Data Classification Techniques · Machine Learning and Data Classification
Methods: Gated Linear Unit · Multi-Head Attention · Attention Is All You Need · Residual Connection · Dropout · Linear Layer · Inverse Square Root Schedule · Byte Pair Encoding · Adafactor · Softmax
