TIM: A Large-Scale Dataset and large Timeline Intelligence Model for Open-domain Timeline Summarization
Chuanrui Hu, Wei Hu, Penghang Yu, Hua Zhang, Bing-Kun Bao

TL;DR
This paper introduces a large-scale dataset for open-domain timeline summarization (TLS) together with TIM, a large Timeline Intelligence Model trained to capture topic evolution and filter out irrelevant information.
Contribution
The paper presents a large-scale TLS dataset and TIM, which the authors describe as the first large Timeline Intelligence Model for open-domain TLS, trained with a progressive optimization strategy that includes instruction tuning and dual-alignment reward learning.
Findings
TIM outperforms existing methods in open-domain timeline summarization.
The dataset contains over 1,000 news topics and more than 3,000 annotated TLS instances.
The model effectively captures topic evolution and filters irrelevant details.
Abstract
Open-domain Timeline Summarization (TLS) is crucial for monitoring the evolution of news topics. To identify changes in news topics, existing methods typically employ general Large Language Models (LLMs) to summarize relevant timestamps from retrieved news. While general LLMs demonstrate capabilities in zero-shot news summarization and timestamp localization, they struggle with assessing topic relevance and understanding topic evolution. Consequently, the summarized information often includes irrelevant details or inaccurate timestamps. To address these issues, we propose the first large Timeline Intelligence Model (TIM) for open-domain TLS, which is capable of effectively summarizing open-domain timelines. Specifically, we begin by presenting a large-scale TLS dataset, comprising over 1,000 news topics and more than 3,000 annotated TLS instances. Furthermore, we propose a progressive…
Taxonomy
Topics: Topic Modeling · Data Quality and Management · Machine Learning in Healthcare
