MEL: Legal Spanish Language Model
David Betancur Sánchez, Nuria Aldama García, Álvaro Barbero Jiménez, Marta Guerrero Nieto, Patricia Marsà Morales, Nicolás Serrano Salas, Carlos García Hernán, Pablo Haya Coll, Elena Montiel Ponsoda, Pablo Calleja Ibáñez

TL;DR
This paper introduces MEL, a Spanish legal language model built on XLM-RoBERTa-large and fine-tuned on Spanish legal texts. On evaluation benchmarks it shows a significant improvement over baseline models in understanding legal Spanish.
Contribution
The paper presents the development and evaluation of MEL, a Spanish legal language model fine-tuned on domain-specific legal texts, which improves NLP performance on legal Spanish, an underrepresented language in this domain.
Findings
MEL outperforms baseline models on legal Spanish benchmarks.
The model achieves top results in multiple legal NLP tasks.
Case studies show effective application to real legal texts.
Abstract
Legal texts, characterized by complex and specialized terminology, present a significant challenge for language models. Adding an underrepresented language, such as Spanish, to the mix makes it even more challenging. While pre-trained models like XLM-RoBERTa have shown capabilities in handling multilingual corpora, their performance on domain-specific documents remains underexplored. This paper presents the development and evaluation of MEL, a legal language model based on XLM-RoBERTa-large, fine-tuned on legal documents such as the BOE (Boletín Oficial del Estado, the official Spanish state gazette) and congress texts. We detail the data collection, processing, training, and evaluation processes. Evaluation benchmarks show a significant improvement over baseline models in understanding the legal Spanish language. We also present case studies demonstrating the model's application to new…
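The abstract does not say which training objective was used to adapt the model. As a rough illustration only, the sketch below assumes masked language modelling, the objective XLM-RoBERTa itself was pre-trained with, and shows the usual 80/10/10 token-corruption step on a whitespace-tokenized sentence. The function name `mask_tokens`, the toy vocabulary, and the 15% masking rate are assumptions made for this example and are not taken from the paper.

```python
import random

MASK_TOKEN = "<mask>"  # mask token string used by XLM-RoBERTa tokenizers


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Corrupt a token list for masked language modelling (hypothetical helper).

    Each token is selected with probability `mask_prob`. A selected token is
    replaced by MASK_TOKEN 80% of the time, by a random vocabulary token 10%
    of the time, and left unchanged 10% of the time. Labels hold the original
    token at selected positions and None elsewhere (ignored by the loss).
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # the model must recover the original token
            roll = rng.random()
            if roll < 0.8:
                corrupted.append(MASK_TOKEN)
            elif roll < 0.9:
                corrupted.append(rng.choice(vocab))
            else:
                corrupted.append(tok)
        else:
            labels.append(None)  # position not scored
            corrupted.append(tok)
    return corrupted, labels


# Toy example: real training would use subword tokens, not whitespace splits.
sentence = "El Boletín Oficial del Estado publica las leyes".split()
toy_vocab = ["ley", "decreto", "artículo", "tribunal"]
corrupted, labels = mask_tokens(sentence, toy_vocab, mask_prob=0.15, seed=0)
```

In practice this step is usually delegated to a library data collator operating on subword IDs; the plain-Python version is only meant to make the corruption rule explicit.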
Taxonomy
Topics: Natural Language Processing Techniques · Linguistics and Terminology Studies
