Attribution analysis of legal language as used by LLM
Richard K. Belew

TL;DR
This study analyzes how legal language influences LLM performance using attribution techniques, revealing that tokenizer differences significantly affect model behavior and classification accuracy in legal tasks.
Contribution
It introduces an attribution-based approach to understand legal language processing in LLMs and highlights the impact of tokenization differences on model performance.
Findings
Tokenizer differences explain most performance variations.
Attribution techniques reveal model decision reasons.
Legal tokens can be identified through frequency and stop word analysis.
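As a rough illustration of the last finding, the sketch below ranks tokens by how much more frequent they are in a legal corpus than in a general one, after dropping stop words. The toy corpora, the stop-word list, the add-one smoothing, and the names `token_counts` and `candidate_legal_tokens` are all assumptions made for this example. The paper's actual procedure is not described here and may differ.

```python
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for the example.
STOP_WORDS = {"the", "a", "of", "to", "and", "in", "is", "that", "was", "by"}


def token_counts(texts):
    """Count whitespace-separated, lower-cased tokens across texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def candidate_legal_tokens(legal_texts, general_texts, min_ratio=2.0):
    """Return non-stop-word tokens whose smoothed relative frequency in the
    legal corpus is at least min_ratio times that in the general corpus,
    sorted from most to least over-represented."""
    legal = token_counts(legal_texts)
    general = token_counts(general_texts)
    legal_total = sum(legal.values())
    general_total = sum(general.values())
    vocab = len(set(legal) | set(general))
    scored = {}
    for tok, count in legal.items():
        if tok in STOP_WORDS:
            continue
        # Add-one smoothing so tokens unseen in the general corpus stay finite.
        p_legal = (count + 1) / (legal_total + vocab)
        p_general = (general[tok] + 1) / (general_total + vocab)
        ratio = p_legal / p_general
        if ratio >= min_ratio:
            scored[tok] = ratio
    return sorted(scored, key=scored.get, reverse=True)


# Toy corpora, invented for illustration only.
legal_texts = [
    "the court overruled the holding of the lower court",
    "the holding was overruled by the court",
]
general_texts = [
    "the weather was nice in the park",
    "a dog ran to the park and the lake",
]
ranked = candidate_legal_tokens(legal_texts, general_texts)
```

A real analysis would use the model's own subword tokenizer rather than whitespace splitting, since the findings above attribute most performance variation to tokenizer differences.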
Abstract
Three publicly available LLMs specifically designed for legal tasks have been implemented and shown to benefit in classification accuracy from training over legal corpora, but why and how? Here we use two publicly available legal datasets: a simpler binary classification task of "overruling" texts, and a more elaborate multiple-choice task identifying "holding" judicial decisions. We report on experiments contrasting the legal LLMs with a generic BERT model on both datasets. We use integrated gradient attribution techniques to impute "causes" of variation in the models' performance, and characterize them in terms of the tokenization each model uses. We find that while all models can correctly classify some test examples from the casehold task, other examples are correctly classified by only one model, and attribution can be used to highlight the reasons for this. We…
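For readers unfamiliar with integrated gradients, here is a minimal sketch on a toy logistic model. It is not the paper's implementation: the weights, the input, the all-zero baseline, and the function names are assumptions chosen for illustration. In a BERT-style classifier the features would be token embeddings, and the gradients would come from automatic differentiation, not a closed form. The attribution for feature i is (x_i - x'_i) times the average gradient along the straight path from the baseline x' to the input x.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    """Toy classifier: probability from a logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def model_grad(x, w, b):
    """Closed-form gradient of the logistic output with respect to x."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum."""
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        grad = model_grad(point, w, b)
        for i in range(n):
            avg_grad[i] += grad[i] / steps
    # Scale the averaged gradient by the input's offset from the baseline.
    return [(xi - bi) * gi for xi, bi, gi in zip(x, baseline, avg_grad)]


# Hypothetical weights and input; the third feature has zero weight.
w = [2.0, -1.0, 0.0]
b = -0.5
x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline, w, b)
```

A useful sanity check is the completeness property: the attributions should sum to approximately `model(x, w, b) - model(baseline, w, b)`, and a feature the model ignores (here the third) should receive zero attribution.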
Taxonomy
Topics: linguistics and terminology studies
Methods: Attention Is All You Need · Layer Normalization · Softmax · Linear Warmup With Linear Decay · Adam · Residual Connection · Dropout · Linear Layer · Dense Connections
