Synergizing Unsupervised Episode Detection with LLMs for Large-Scale News Events
Priyanka Kargupta, Yunyi Zhang, Yizhu Jiao, Siru Ouyang, Jiawei Han

TL;DR
This paper presents EpiMine, an unsupervised framework that combines episodic structure detection with large language models to improve large-scale news event identification, addressing interpretability and adaptability challenges.
Contribution
Introducing the novel task of episode detection in news corpora and developing EpiMine, which leverages natural episodic partitions and LLMs for enhanced event detection.
Findings
EpiMine achieves a 59.2% average gain over baselines.
EpiMine effectively identifies cohesive episodes in diverse datasets.
The framework improves interpretability and adaptability in event detection.
Abstract
State-of-the-art automatic event detection struggles with interpretability and adaptability to evolving large-scale key events -- unlike episodic structures, which excel in these areas. Often overlooked, episodes represent cohesive clusters of core entities performing actions at a specific time and location; a partially ordered sequence of episodes can represent a key event. This paper introduces a novel task, episode detection, which identifies episodes within a news corpus of key event articles. Detecting episodes poses unique challenges, as they lack explicit temporal or locational markers and cannot be merged using semantic similarity alone. While large language models (LLMs) can aid with these reasoning difficulties, they suffer with long contexts typical of news corpora. To address these challenges, we introduce EpiMine, an unsupervised framework that identifies a key event's…
Taxonomy
Topics: Video Analysis and Summarization · Network Security and Intrusion Detection · Web Data Mining and Analysis
