LegalDrill: Diagnosis-Driven Synthesis for Legal Reasoning in Small Language Models
Tianchun Li, Haochen Liu, Vishwa Pardeshi, Xingchen Wang, Tianci Liu, Huijun Zhao, Wei Fan, Jing Gao

TL;DR
LegalDrill is a novel framework that enhances small language models' legal reasoning by synthesizing and refining reasoning trajectories through a teacher-student approach, improving performance without requiring extensive expert annotations.
Contribution
The paper introduces a diagnosis-driven synthesis method that extracts and refines reasoning data from a capable teacher model to improve small language models' legal reasoning abilities.
Findings
LegalDrill significantly improves SLM performance on legal benchmarks.
The approach reduces dependence on expensive expert annotations.
It enables scalable training of legal reasoning systems.
Abstract
Small language models (SLMs) are promising for real-world deployment due to their efficiency and low operational cost. However, their limited capacity makes them struggle with high-stakes legal reasoning tasks that require coherent statute interpretation and logically consistent deduction. Furthermore, training SLMs for such tasks demands high-quality, concise reasoning trajectories, which are prohibitively expensive to collect manually and difficult to curate via standard rejection sampling, which lacks granularity beyond final verdicts. To address these challenges, we propose LegalDrill, a diagnosis-driven synthesis framework that extracts and iteratively refines reasoning trajectories from a capable teacher via fine-grained prompting, and then employs self-reflective verification to adaptively select the most effective data for the SLM student. The resulting data empower SLM training through…
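The loop the abstract outlines (a teacher drafts a reasoning trajectory, a diagnosis step flags fine-grained issues, the teacher refines, and a verification step decides what reaches the student) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Trajectory` type, the `synthesize` function, the `max_rounds` limit, and all four callable signatures are hypothetical, and the abstract does not specify how diagnosis or verification are actually carried out. The toy stand-ins at the bottom replace real model calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trajectory:
    """Hypothetical container for one reasoning trace."""
    question: str
    reasoning: str
    verdict: str


def synthesize(
    questions: List[str],
    draft: Callable[[str], Trajectory],                    # assumed: teacher writes a first trace
    diagnose: Callable[[Trajectory], List[str]],           # assumed: returns a list of found issues
    refine: Callable[[Trajectory, List[str]], Trajectory], # assumed: teacher rewrites given issues
    verify: Callable[[Trajectory], bool],                  # assumed: final selection check
    max_rounds: int = 3,                                   # assumed cap on refinement rounds
) -> List[Trajectory]:
    """Keep only trajectories that end with no open issues and pass verification."""
    selected = []
    for question in questions:
        traj = draft(question)
        issues = diagnose(traj)
        rounds = 0
        # Refine while the diagnosis still reports problems, up to the cap.
        while issues and rounds < max_rounds:
            traj = refine(traj, issues)
            issues = diagnose(traj)
            rounds += 1
        if not issues and verify(traj):
            selected.append(traj)
    return selected


# Toy stand-ins for the teacher and the checks (illustrative only).
def toy_draft(question: str) -> Trajectory:
    verdict = "" if question == "unanswerable" else "liable"
    return Trajectory(question, "The tenant breached the lease.", verdict)


def toy_diagnose(traj: Trajectory) -> List[str]:
    # Flag traces that never cite a statute section.
    return [] if "Sec." in traj.reasoning else ["missing statute citation"]


def toy_refine(traj: Trajectory, issues: List[str]) -> Trajectory:
    # Append a placeholder citation to address the flagged issue.
    return Trajectory(traj.question, traj.reasoning + " (see Sec. 1)", traj.verdict)


def toy_verify(traj: Trajectory) -> bool:
    # Accept only traces that reach a non-empty verdict.
    return bool(traj.verdict)
```

Unlike rejection sampling on the final verdict alone, a loop of this shape lets the selection depend on issues found inside the reasoning itself, which is the granularity the abstract says standard rejection sampling lacks.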
