Supporting Humans in Evaluating AI Summaries of Legal Depositions
Naghmeh Farzi, Laura Dietz, Dave D. Lewis

TL;DR
This paper introduces a nugget-based approach to help legal professionals evaluate and improve AI-generated summaries of depositions, addressing the critical need for factual accuracy in legal contexts.
Contribution
It adapts nugget-based evaluation methods for end-user support in legal summarization, enabling better comparison and manual enhancement of summaries.
Findings
Nugget-based evaluation aids legal professionals in assessing summary quality.
The prototype helps users identify the better of two candidate summaries.
It assists users in manually improving AI-generated summaries.
Abstract
While large language models (LLMs) are increasingly used to summarize long documents, this trend poses significant challenges in the legal domain, where the factual accuracy of deposition summaries is crucial. Nugget-based methods have been shown to be extremely helpful for the automated evaluation of summarization approaches. In this work, we translate these methods to the user side and explore how nuggets could directly assist end users. Although prior systems have demonstrated the promise of nugget-based evaluation, its potential to support end users remains underexplored. Focusing on the legal domain, we present a prototype that leverages a factual nugget-based approach to support legal professionals in two concrete scenarios: (1) determining which of two summaries is better, and (2) manually improving an automatically generated summary.
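The two scenarios in the abstract can be illustrated with a minimal sketch. It is not the authors' prototype: the `Nugget` class, the keyword-matching rule, and the sample deposition facts are all hypothetical, chosen only to show how a set of factual nuggets could be used to compare two summaries and to list what a summary is missing. A real system would likely replace the keyword check with an LLM or entailment model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nugget:
    """Hypothetical factual nugget: a short fact plus keywords that signal it."""
    text: str
    keywords: tuple[str, ...]


def covered(nugget: Nugget, summary: str) -> bool:
    # Assumption: a nugget counts as covered when every keyword appears in
    # the summary (case-insensitive). This is a stand-in for a real matcher.
    lowered = summary.lower()
    return all(k.lower() in lowered for k in nugget.keywords)


def compare(nuggets: list[Nugget], summary_a: str, summary_b: str) -> str:
    """Scenario 1: report which summary covers more nuggets."""
    a = sum(covered(n, summary_a) for n in nuggets)
    b = sum(covered(n, summary_b) for n in nuggets)
    if a == b:
        return "tie"
    return "A" if a > b else "B"


def missing(nuggets: list[Nugget], summary: str) -> list[str]:
    """Scenario 2: list the facts a reviewer may want to add to the summary."""
    return [n.text for n in nuggets if not covered(n, summary)]


# Invented example data, for illustration only.
nuggets = [
    Nugget("The witness signed the contract on March 3.", ("signed", "march 3")),
    Nugget("The witness never met the defendant.", ("never met",)),
]
summary_a = "The witness signed the contract on March 3 and never met the defendant."
summary_b = "The witness signed a contract."

print(compare(nuggets, summary_a, summary_b))
print(missing(nuggets, summary_b))
```

Counting covered nuggets gives a simple, inspectable basis for preferring one summary, and the list of uncovered nuggets doubles as a checklist for manual revision.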
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Law · Text Readability and Simplification
