Impacts of Continued Legal Pre-Training and IFT on LLMs' Latent Representations of Human-Defined Legal Concepts
Shaun Ho

TL;DR
This study investigates how continued legal pre-training and instruction fine-tuning affect large language models' focus on human-defined legal concepts, revealing uneven impacts and limited alignment with legal knowledge structures.
Contribution
It provides a detailed analysis of the effects of legal training on LLM attention patterns and their alignment with human legal concepts, highlighting areas for further research.
Findings
Legal training impacts attention unevenly across concepts
Legal representations do not fully align with human-defined structures
Further investigation needed into legal LLM training dynamics
Abstract
This paper aims to offer AI & Law researchers and practitioners a more detailed understanding of whether and how continued pre-training and instruction fine-tuning (IFT) of large language models (LLMs) on legal corpora increase their utilization of human-defined legal concepts when developing global contextual representations of input sequences. We compared three models: Mistral 7B, SaulLM-7B-Base (Mistral 7B with continued pre-training on legal corpora), and SaulLM-7B-Instruct (with further IFT). This preliminary assessment examined 7 distinct text sequences from recent AI & Law literature, each containing a human-defined legal concept. We first compared the proportions of total attention the models allocated to subsets of tokens representing the legal concepts. We then visualized patterns of raw attention score alterations, evaluating whether legal training introduced novel attention…
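The first comparison in the abstract, the proportion of total attention falling on the tokens of a legal concept, can be illustrated with a small sketch. The page does not give the paper's exact formula, so everything here is an assumption: the function names (`causal_attention`, `concept_attention_share`, `share_shift`) are hypothetical, attention is treated as a single causal softmax matrix rather than per layer and head, and the scores are toy values, not model outputs.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of raw scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(scores):
    """Row-wise softmax with a causal mask: token i attends to tokens 0..i."""
    n = len(scores)
    attn = []
    for i in range(n):
        weights = softmax(scores[i][: i + 1])
        attn.append(weights + [0.0] * (n - i - 1))
    return attn


def concept_attention_share(attn, concept_positions):
    """Fraction of all attention mass that lands on the concept token columns.

    Assumed metric: sum of weights in the concept columns divided by the
    sum of all weights in the matrix.
    """
    concept = set(concept_positions)
    total = sum(sum(row) for row in attn)
    on_concept = sum(
        w for row in attn for j, w in enumerate(row) if j in concept
    )
    return on_concept / total


def share_shift(base_attn, tuned_attn, concept_positions):
    """Change in concept share between a base model and a legally trained one."""
    return (
        concept_attention_share(tuned_attn, concept_positions)
        - concept_attention_share(base_attn, concept_positions)
    )


# Toy example: 4 tokens, concept occupies positions 1 and 2.
base_scores = [[0.0] * 4 for _ in range(4)]             # uniform attention
tuned_scores = [[0.0, 2.0, 2.0, 0.0] for _ in range(4)]  # boosted concept tokens

base = causal_attention(base_scores)
tuned = causal_attention(tuned_scores)
print(concept_attention_share(base, [1, 2]))
print(share_shift(base, tuned, [1, 2]))
```

In a real setting the matrices would come from each model's attention outputs for the same input sequence, and the share would likely be computed per layer and head before aggregating; how the paper aggregates is not stated in this excerpt.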
Taxonomy
Topics: Legal Education and Practice Innovations · Artificial Intelligence in Law · Law, Economics, and Judicial Systems
Methods: Softmax · Attention Is All You Need
