LawngNLI: A Long-Premise Benchmark for In-Domain Generalization from Short to Long Contexts and for Implication-Based Retrieval
William Bruno, Dan Roth

TL;DR
LawngNLI is a new long-premise legal inference benchmark that evaluates models' ability to generalize from short to long contexts and supports implication-based retrieval, highlighting the need for domain-specific long-premise datasets.
Contribution
Introduces LawngNLI, a long-premise legal NLI dataset built from U.S. legal opinions with automatic labels of high human-validated accuracy, and demonstrates its use for benchmarking in-domain generalization and implication-based retrieval.
Findings
For at least certain domains, such as law, large-scale long-premise datasets are necessary.
Models fine-tuned only on short premises fall short of top performance on long premises.
LawngNLI enables effective evaluation of implication-based retrieval systems.
Abstract
Natural language inference has trended toward studying contexts beyond the sentence level. An important application area is law: past cases often do not foretell how they apply to new situations, and implications must be inferred. This paper introduces LawngNLI, constructed from U.S. legal opinions with automatic labels of high human-validated accuracy. Premises are long and multigranular. Experiments show two use cases. First, LawngNLI can benchmark in-domain generalization from short to long contexts. It has remained unclear whether large-scale long-premise NLI datasets actually need to be constructed: near-top performance on long premises could be achievable by fine-tuning using short premises. Without multigranularity, benchmarks cannot distinguish a lack of fine-tuning on long premises from domain shift between short and long datasets. In contrast, our long and short premises share…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Artificial Intelligence in Law
Methods: Test
