CREAM: Comparison-Based Reference-Free ELO-Ranked Automatic Evaluation for Meeting Summarization

Ziwei Gong; Lin Ai; Harshsaiprasad Deshpande; Alexander Johnson; Emmy Phung; Zehui Wu; Ahmad Emami; Julia Hirschberg

arXiv:2409.10883 · cs.CL · September 18, 2024

TL;DR

CREAM is a novel reference-free evaluation framework for meeting summarization that uses Elo ranking and key facts alignment to assess summary quality without needing reference summaries.

Contribution

It introduces a new evaluation method combining chain-of-thought reasoning and key facts alignment with Elo ranking for complex meeting summaries.

Findings

01 Effective in evaluating long-context and dialogue-based summaries

02 Outperforms existing reference-free evaluation methods

03 Provides robust comparison across models and prompts

Abstract

Large Language Models (LLMs) have spurred interest in automatic evaluation methods for summarization, offering a faster, more cost-effective alternative to human evaluation. However, existing methods often fall short when applied to complex tasks such as long-context summarization and dialogue-based meeting summarization. In this paper, we introduce CREAM (Comparison-Based Reference-Free Elo-Ranked Automatic Evaluation for Meeting Summarization), a novel framework that addresses the unique challenges of evaluating meeting summaries. CREAM combines chain-of-thought reasoning with key facts alignment to assess the conciseness and completeness of model-generated summaries without requiring reference summaries. By employing an Elo ranking system, our approach provides a robust mechanism for comparing the quality of different models or prompt configurations.
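To make the Elo ranking step concrete, the following is a minimal sketch of how pairwise comparison outcomes could be turned into a ranking of summarization systems. It uses the standard Elo update rule; the function names, the initial rating of 1000, and the K-factor of 32 are illustrative assumptions and are not taken from the paper, whose abstract does not specify these details. The pairwise outcomes themselves (which summary won) are assumed to come from an external judge, such as an LLM-based comparison.

```python
from typing import Dict, Iterable, Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (a value in (0, 1))."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    ratings: Dict[str, float], a: str, b: str, score_a: float, k: float = 32.0
) -> None:
    """Update ratings in place after one comparison.

    score_a is 1.0 if A's summary was preferred, 0.0 if B's was,
    and 0.5 for a tie. k (the K-factor) is an assumed value.
    """
    exp_a = expected_score(ratings[a], ratings[b])
    delta = k * (score_a - exp_a)
    ratings[a] += delta
    ratings[b] -= delta  # zero-sum: B loses exactly what A gains


def rank_systems(
    comparisons: Iterable[Tuple[str, str, float]],
    initial: float = 1000.0,  # assumed starting rating
    k: float = 32.0,
) -> Dict[str, float]:
    """Run Elo updates over (system_a, system_b, score_a) triples."""
    ratings: Dict[str, float] = {}
    for a, b, score_a in comparisons:
        ratings.setdefault(a, initial)
        ratings.setdefault(b, initial)
        update_elo(ratings, a, b, score_a, k)
    return ratings
```

For example, `rank_systems([("model_a", "model_b", 1.0)])` raises `model_a` to 1016 and lowers `model_b` to 984, since two equally rated systems each have an expected score of 0.5. Because sequential Elo updates depend on the order of comparisons, a practical setup might shuffle the comparisons and average ratings over several passes; that refinement is likewise an assumption rather than something stated in the abstract.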

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques