BriefMe: A Legal NLP Benchmark for Assisting with Legal Briefs
Jesse Woo, Fateme Hashemi Chaleshtori, Ana Marasović, Kenneth Marino

TL;DR
BriefMe introduces a legal NLP benchmark with tasks like argument summarization, completion, and case retrieval, highlighting current models' strengths and weaknesses in assisting legal professionals.
Contribution
This work presents a new dataset and benchmark for legal briefs, focusing on tasks that evaluate language models' ability to support legal writing and reasoning.
Findings
Large language models excel at summarization and guided completion.
Models perform poorly on realistic argument completion.
Models struggle with relevant case retrieval.
Abstract
A core part of legal work that has been under-explored in Legal NLP is the writing and editing of legal briefs. This requires not only a thorough understanding of the law of a jurisdiction, from judgments to statutes, but also the ability to make new arguments to try to expand the law in a new direction and make novel and creative arguments that are persuasive to judges. To capture and evaluate these legal skills in language models, we introduce BRIEFME, a new dataset focused on legal briefs. It contains three tasks for language models to assist legal professionals in writing briefs: argument summarization, argument completion, and case retrieval. In this work, we describe the creation of these tasks, analyze them, and show how current models perform. We see that today's large language models (LLMs) are already quite good at the summarization and guided completion tasks, even beating…
Taxonomy
Topics: Artificial Intelligence in Law · Topic Modeling · Multi-Agent Systems and Negotiation
