Less is More for Long Document Summary Evaluation by LLMs

Yunshu Wu; Hayate Iso; Pouya Pezeshkpour; Nikita Bhutani; Estevam Hruschka

arXiv:2309.07382 · cs.CL · January 19, 2024 · 2 cites

TL;DR

This paper proposes an Extract-then-Evaluate method for long document summary evaluation using LLMs, reducing costs and improving correlation with human judgments by focusing on key sentences.

Contribution

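It introduces a novel extraction-based evaluation approach that addresses two problems in LLM-based evaluation of long-document summaries: high computational cost and the Lost-in-the-Middle problem, in which important information in the middle of a long source document is overlooked.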

Findings

01 Significantly reduces evaluation costs

02 Achieves higher correlation with human evaluations

03 Provides practical guidelines for document length and extraction methods
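
Finding 02 concerns how well LLM-assigned scores track human scores. This listing does not say which correlation coefficient is reported, so the sketch below assumes Spearman rank correlation as one common choice. The function names (`average_ranks`, `pearson`, `spearman`) are illustrative and are not taken from the paper or its repository.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(llm_scores: Sequence[float], human_scores: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(llm_scores) != len(human_scores) or len(llm_scores) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(llm_scores), average_ranks(human_scores))
```

A higher value means the LLM ranks summaries more like the human annotators do; the coefficient and aggregation level used in the paper itself may differ.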

Abstract

Large Language Models (LLMs) have shown promising performance in summary evaluation tasks, yet they face challenges such as high computational costs and the Lost-in-the-Middle problem where important information in the middle of long documents is often overlooked. To address these issues, this paper introduces a novel approach, Extract-then-Evaluate, which involves extracting key sentences from a long source document and then evaluating the summary by prompting LLMs. The results reveal that the proposed method not only significantly reduces evaluation costs but also exhibits a higher correlation with human evaluations. Furthermore, we provide practical recommendations for optimal document length and sentence extraction methods, contributing to the development of cost-effective yet more accurate methods for LLM-based text generation evaluation.
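
The two-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the extraction method, so the sketch scores each source sentence by unigram overlap with the summary and keeps the best ones within a word budget. The prompt wording, the `max_words` default, and the `llm` callable are all hypothetical placeholders.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter on ., ! or ? followed by whitespace (assumption)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(text: str) -> List[str]:
    """Lowercased alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_key_sentences(document: str, summary: str, max_words: int) -> str:
    """Keep the sentences that overlap most with the summary, within a word budget."""
    summary_vocab = set(tokens(summary))
    scored = []
    for idx, sent in enumerate(split_sentences(document)):
        words = tokens(sent)
        if not words:
            continue
        overlap = sum(1 for w in words if w in summary_vocab) / len(words)
        scored.append((overlap, idx, sent, len(words)))
    # Highest overlap first; earlier sentences win ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    chosen, used = [], 0
    for _, idx, sent, n_words in scored:
        if used + n_words > max_words:
            continue  # skip sentences that would exceed the budget
        chosen.append((idx, sent))
        used += n_words
    chosen.sort()  # restore original document order
    return " ".join(sent for _, sent in chosen)


def build_prompt(extracted: str, summary: str) -> str:
    """Hypothetical evaluation prompt; the paper's actual prompt may differ."""
    return (
        "Rate the summary from 1 to 5 for consistency with the source.\n\n"
        f"Source (extracted sentences):\n{extracted}\n\n"
        f"Summary:\n{summary}\n\nScore:"
    )


def extract_then_evaluate(
    document: str,
    summary: str,
    llm: Callable[[str], str],
    max_words: int = 1024,
) -> str:
    """Stage 1: extract key sentences. Stage 2: prompt the LLM with them."""
    extracted = extract_key_sentences(document, summary, max_words)
    return llm(build_prompt(extracted, summary))
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client. The cost saving comes from the prompt containing at most `max_words` words of source text instead of the full document.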

Code & Models

Repositories

megagonlabs/llm-longeval (Official)

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques