SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain
Pierre Colombo, Telmo Pires, Malik Boudiaf, Rui Melo, Dominic Culver, Sofia Morgado, Etienne Malaboeuf, Gabriel Hautreux, Johanne Charpentier, Michael Desa

TL;DR
This paper presents SaulLM-54B and SaulLM-141B, large language models specialized for the legal domain through large-scale domain adaptation. They achieve state-of-the-art performance on legal benchmarks and are released as resources for future research.
Contribution
Introduction of two large-scale legal domain language models using novel domain adaptation strategies and synthetic data, surpassing previous open-source models in legal tasks.
Findings
Achieved state-of-the-art performance on LegalBench-Instruct.
Effectively utilized synthetic data for legal interpretation.
Provided open-source models for legal NLP research.
Abstract
In this paper, we introduce SaulLM-54B and SaulLM-141B, two large language models (LLMs) tailored for the legal sector. These models, which feature architectures of 54 billion and 141 billion parameters, respectively, are based on the Mixtral architecture. The development of SaulLM-54B and SaulLM-141B is guided by large-scale domain adaptation, divided into three strategies: (1) the exploitation of continued pretraining involving a base corpus that includes over 540 billion legal tokens, (2) the implementation of a specialized legal instruction-following protocol, and (3) the alignment of model outputs with human preferences in legal interpretations. The integration of synthetically generated data in the second and third steps enhances the models' capabilities in interpreting and processing legal texts, effectively reaching state-of-the-art performance and outperforming previous…
Taxonomy
Topics: Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning · Speech Recognition and Synthesis
Methods: Balanced Selection
