Unsupervised Law Article Mining based on Deep Pre-Trained Language Representation Models with Application to the Italian Civil Code
Andrea Tagarelli, Andrea Simeri

TL;DR
This paper introduces LamBERTa, a deep learning framework based on BERT, for law article retrieval in the Italian civil code, demonstrating superior performance in extreme classification scenarios with limited labeled data.
Contribution
To the authors' knowledge, this is the first study to adapt BERT to Italian law article prediction. It addresses extreme classification, few-shot learning, and unsupervised labeling in legal NLP.
Findings
LamBERTa outperforms existing deep learning classifiers.
Effective in few-shot and multi-label scenarios.
Provides insights into model explainability and interpretability.
Abstract
Modeling law search and retrieval as prediction problems has recently emerged as a predominant approach in law intelligence. Focusing on the law article retrieval task, we present a deep learning framework named LamBERTa, which is designed for civil-law codes, and specifically trained on the Italian civil code. To our knowledge, this is the first study proposing an advanced approach to law article prediction for the Italian legal system based on a BERT (Bidirectional Encoder Representations from Transformers) learning framework, which has recently attracted increased attention among deep learning approaches, showing outstanding effectiveness in several natural language processing and learning tasks. We define LamBERTa models by fine-tuning an Italian pre-trained BERT on the Italian civil code or its portions, for law article retrieval as a classification task. One key aspect of our…
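The "retrieval as a classification task" framing in the abstract can be illustrated with a minimal sketch: each article is treated as its own class, a query is encoded into a vector, and a linear layer followed by a softmax scores every article. Everything below is an assumption made for illustration only. The article texts are hypothetical placeholders (not Civil Code text), `encode` is a toy bag-of-words stand-in for a fine-tuned Italian BERT encoder, and the weights are set by hand rather than learned, so this is not the authors' implementation.

```python
import math
from collections import Counter

# Hypothetical placeholder articles: NOT real Civil Code text.
ARTICLES = {
    "art_A": "ownership of land and buildings",
    "art_B": "contract formation offer and acceptance",
    "art_C": "inheritance and succession of heirs",
}
LABELS = sorted(ARTICLES)  # one class per article
VOCAB = sorted({w for text in ARTICLES.values() for w in text.split()})


def encode(text):
    """Toy bag-of-words encoder, standing in for a BERT sentence embedding."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in VOCAB]


# Toy linear head: one weight row per article class. Here each row is simply
# the article's own encoding; in a fine-tuned model these would be learned.
WEIGHTS = [encode(ARTICLES[label]) for label in LABELS]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(query, k=2):
    """Return the k most probable (article_id, probability) pairs for a query."""
    h = encode(query)
    logits = [sum(w * x for w, x in zip(row, h)) for row in WEIGHTS]
    probs = softmax(logits)
    ranked = sorted(zip(LABELS, probs), key=lambda p: p[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    for article_id, prob in predict("offer and acceptance of a contract"):
        print(article_id, round(prob, 3))
```

With one class per article, the label space grows with the size of the code, which is why the summary describes the setting as extreme classification.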
Taxonomy
Methods: Attention Is All You Need · Linear Layer · Attention Dropout · WordPiece · Weight Decay · Softmax · Residual Connection · Adam · Dropout
