Presenting a New Dataset for the Timeline Generation Problem
Xavier Holt, Will Radford, Ben Hachey

TL;DR
This paper introduces a new publicly available dataset of news articles and timelines for entities, along with an evaluation methodology using ROUGE, to advance research in timeline generation.
Contribution
The paper provides the first standard dataset and evaluation framework for entity timeline generation, addressing the previous lack of both.
Findings
Top Google results outperform straw-man baselines in timeline quality
Dataset includes 18,793 articles for 39 entities
ROUGE is proposed as the metric for evaluating timeline quality
Abstract
The timeline generation task summarises an entity's biography by selecting stories representing key events from a large pool of relevant documents. This paper addresses the lack of a standard dataset and evaluative methodology for the problem. We present and make publicly available a new dataset of 18,793 news articles covering 39 entities. For each entity, we provide a gold standard timeline and a set of entity-related articles. We propose ROUGE as an evaluation metric and validate our dataset by showing that top Google results outperform straw-man baselines.
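To make the proposed evaluation concrete, here is a minimal sketch of an n-gram overlap score in the style of ROUGE-N, comparing a system timeline's text against a gold-standard one. It is an illustration only: the abstract does not say which ROUGE variant or preprocessing the authors use, so the whitespace tokenisation, the function name rouge_n, and the example strings are all assumptions.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """ROUGE-N style precision, recall and F1 from clipped n-gram overlap.

    Assumption: lowercase + whitespace tokenisation, no stemming or
    stopword removal. The paper's actual configuration may differ.
    """
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical timeline entries, invented for illustration only.
gold = "elected president in 2008"
system = "elected president of the senate"
scores = rouge_n(system, gold, n=1)
print(scores)
```

In this toy case two unigrams ("elected", "president") are shared, so recall is 2/4 against the gold entry and precision is 2/5 against the system entry. A full evaluation would concatenate or align all entries of a timeline before scoring.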
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
