SUMIE: A Synthetic Benchmark for Incremental Entity Summarization

Eunjeong Hwang; Yichao Zhou; Beliz Gunel; James Bradley Wendt; Sandeep Tata

arXiv:2406.05079·cs.CL·June 10, 2024

Open Access

TL;DR

SUMIE is a synthetic benchmark dataset designed to evaluate how well language models can incrementally update entity summaries, revealing current limitations and guiding future improvements.

Contribution

We introduce SUMIE, a novel synthetic dataset that captures real-world complexities for incremental entity summarization, and provide an evaluation framework for LLMs on this task.

Findings

01

State-of-the-art LLMs struggle with Incremental Entity Summarization (IES), with F1 scores that do not exceed 80.4%.

02

The dataset exposes issues such as incorrect entity association and incomplete information presentation.

03

The alignment between generated summaries and paragraphs exceeds 96%, confirming the dataset's quality.
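
To make the F1 figure above concrete, here is a minimal sketch of how an incrementally updated entity summary could be scored against a reference. It is an illustration under stated assumptions, not the paper's evaluation protocol: the attribute-to-values representation, exact-match comparison, and the names `update_summary`, `pairs`, and `f1_score` are all hypothetical.

```python
from typing import Dict, Set, Tuple

# Assumed representation: an entity summary maps each attribute to a set of values.
Summary = Dict[str, Set[str]]


def update_summary(summary: Summary, new_facts: Summary) -> Summary:
    """Merge newly extracted facts into an existing summary without mutating it."""
    merged = {attr: set(vals) for attr, vals in summary.items()}
    for attr, vals in new_facts.items():
        merged.setdefault(attr, set()).update(vals)
    return merged


def pairs(summary: Summary) -> Set[Tuple[str, str]]:
    """Flatten a summary into a set of (attribute, value) pairs."""
    return {(attr, val) for attr, vals in summary.items() for val in vals}


def f1_score(predicted: Summary, gold: Summary) -> float:
    """Exact-match F1 over (attribute, value) pairs (an assumed metric)."""
    pred, ref = pairs(predicted), pairs(gold)
    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical entity: a restaurant summary updated with one new paragraph's facts.
current = {"cuisine": {"italian"}}
updated = update_summary(current, {"price": {"moderate"}})

# The reference also records parking, so the update is incomplete.
gold = {"cuisine": {"italian"}, "price": {"moderate"}, "parking": {"street"}}
score = f1_score(updated, gold)
```

In this toy case the updated summary is fully precise but misses one of three reference pairs, which is the "incomplete information" failure mode listed in the findings; a value attached to the wrong entity would instead lower precision.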

Abstract

No existing dataset adequately tests how well language models can incrementally update entity summaries, a crucial ability as these models rapidly advance. The Incremental Entity Summarization (IES) task is vital for maintaining accurate, up-to-date knowledge. To address this, we introduce SUMIE, a fully synthetic dataset designed to expose real-world IES challenges. This dataset effectively highlights problems like incorrect entity association and incomplete information presentation. Unlike common synthetic datasets, ours captures the complexity and nuances found in real-world data. We generate informative and diverse attributes, summaries, and unstructured paragraphs in sequence, ensuring high quality. The alignment between generated summaries and paragraphs exceeds 96%, confirming the dataset's quality. Extensive experiments demonstrate the dataset's difficulty: state-of-the-art…

Taxonomy

Topics: Data Quality and Management · Data Management and Algorithms · Topic Modeling