EventSum: A Large-Scale Event-Centric Summarization Dataset for Chinese Multi-News Documents
Mengna Zhu, Kaisheng Zeng, Mao Wang, Kaiming Xiao, Lei Hou, Hongbin Huang, and Juanzi Li

TL;DR
This paper introduces EventSum, a large-scale Chinese dataset for multi-document event summarization, and evaluates advanced language models' performance, highlighting the task's complexity and the importance of specialized evaluation metrics.
Contribution
The paper presents the first large-scale Chinese multi-document summarization dataset focused on events and proposes new evaluation metrics tailored for event-centric summaries.
Findings
Existing LLMs struggle with event-centric multi-document summarization.
The proposed metrics effectively evaluate summary comprehensiveness.
EventSum dataset facilitates future research in this area.
Abstract
In real life, many dynamic events, such as major disasters and large-scale sports events, evolve continuously over time. Obtaining an overview of these events can help people quickly understand the situation and respond more effectively. This is challenging because the key information about an event is often scattered across multiple documents, and summarizing it requires complex event knowledge understanding and reasoning, which is under-explored in previous work. Therefore, we propose the Event-Centric Multi-Document Summarization (ECS) task, which aims to generate concise and comprehensive summaries of a given event based on multiple related news documents. Based on this, we construct the EventSum dataset, built from Baidu Baike entries with extensive human annotation, to facilitate relevant research. It is the first large-scale Chinese multi-document summarization dataset,…
Taxonomy
Topics: Advanced Text Analysis Techniques · Computational and Text Analysis Methods · Topic Modeling
