QAPyramid: Fine-grained Evaluation of Content Selection for Text Summarization

Shiyue Zhang, David Wan, Arie Cattan, Ayal Klein, Ido Dagan, and Mohit Bansal

arXiv:2412.07096 · cs.CL · October 8, 2025

TL;DR

QAPyramid is a more systematic and fine-grained human evaluation method for text summarization. It decomposes each reference summary into question-answer (QA) pairs and checks whether each pair is present in the system summary, which improves the assessment of content selection.

Contribution

The paper proposes a QA-based evaluation framework that refines the Pyramid protocol with finer granularity and automation, while maintaining high inter-annotator agreement without expert annotations.

Findings

01

The automated metrics proposed alongside QAPyramid correlate more strongly with QAPyramid human judgments than other widely adopted metrics.

02

QAPyramid maintains high inter-annotator agreement without expert annotations.

03

Compared to Pyramid, QAPyramid provides more systematic and fine-grained content selection evaluation.

Abstract

How to properly conduct human evaluations for text summarization is a longstanding challenge. The Pyramid human evaluation protocol, which assesses content selection by breaking the reference summary into sub-units and verifying their presence in the system summary, has been widely adopted. However, it suffers from a lack of systematicity in the definition and granularity of the sub-units. We address these problems by proposing QAPyramid, which decomposes each reference summary into finer-grained question-answer (QA) pairs according to the QA-SRL framework. We collect QA-SRL annotations for reference summaries from CNN/DM and evaluate 10 summarization systems, resulting in 8.9K QA-level annotations. We show that, compared to Pyramid, QAPyramid provides more systematic and fine-grained content selection evaluation while maintaining high inter-annotator agreement without needing expert…
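
The presence check described in the abstract can be illustrated with a minimal sketch. It assumes the summary-level score is simply the fraction of reference QA pairs judged present in the system summary; that aggregation rule, the `QAPair` type, the `content_selection_score` function, and the example data are all hypothetical illustrations and are not taken from the paper or its repository. The presence judgments themselves would come from human annotators and are passed in as booleans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAPair:
    """One QA-SRL-style sub-unit extracted from a reference summary (hypothetical type)."""
    question: str
    answer: str


def content_selection_score(judgments: dict[QAPair, bool]) -> float:
    """Return the fraction of reference QA pairs judged present in the system summary.

    Assumption: every QA pair carries equal weight. The paper may aggregate differently.
    """
    if not judgments:
        raise ValueError("at least one QA pair is required")
    # True counts as 1 and False as 0, so the sum is the number of pairs judged present.
    return sum(judgments.values()) / len(judgments)


# Made-up presence judgments for one system summary (illustrative only).
judgments = {
    QAPair("Who announced something?", "the mayor"): True,
    QAPair("What did someone announce?", "a new transit plan"): True,
    QAPair("When did someone announce something?", "on Monday"): False,
}
score = content_selection_score(judgments)
print(f"{score:.3f}")
```

Because each QA pair targets a single predicate-argument relation, a system summary that captures only part of a reference sentence still receives partial credit under this scheme.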

Code & Models

Repositories

zhangshiyue/qapyramid (Official)

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques