Neural Total Variation Distance Estimators for Changepoint Detection in News Data
Csaba Zsolnai, Niels Lörch, Julian Arnold

TL;DR
This paper introduces a neural network-based method for detecting societal shifts in news content by estimating total variation distance, effectively identifying major historical events with minimal domain knowledge.
Contribution
The paper presents a neural total variation distance estimator, built on the learning-by-confusion scheme, for changepoint detection in high-dimensional news data.
Findings
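Detects changepoints that coincide with major events such as 9/11, COVID-19, and elections.
Works on both synthetic datasets and real-world articles from The Guardian, with minimal domain knowledge.
The estimated total variation distance provides a quantitative measure of how much content changed.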
Abstract
Detecting when public discourse shifts in response to major events is crucial for understanding societal dynamics. Real-world data is high-dimensional, sparse, and noisy, making changepoint detection in this domain a challenging endeavor. In this paper, we leverage neural networks for changepoint detection in news data, introducing a method based on the so-called learning-by-confusion scheme, which was originally developed for detecting phase transitions in physical systems. We train classifiers to distinguish between articles from different time periods. The resulting classification accuracy is used to estimate the total variation distance between underlying content distributions, where significant distances highlight changepoints. We demonstrate the effectiveness of this method on both synthetic datasets and real-world data from The Guardian newspaper, successfully identifying major…
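The link between classification accuracy and total variation distance can be illustrated with a small sketch. It relies on a standard fact: when two distributions are sampled in equal proportion, the best achievable accuracy of a classifier telling them apart is (1 + TV) / 2, so 2 * accuracy - 1 estimates the distance (from below, for an imperfect classifier). Everything else here is an assumption made for illustration and not the authors' implementation: a tiny logistic regression stands in for the neural network, one-dimensional Gaussian draws stand in for article features, and the names `train_logistic`, `tv_estimate`, `window`, and `candidates` are hypothetical.

```python
import math
import random


def train_logistic(xs, ys, lr=0.5, epochs=300):
    """Fit a 1-D logistic regression by full-batch gradient descent."""
    w, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            z = max(-30.0, min(30.0, w * x + bias))  # clamp to avoid overflow
            p = 1.0 / (1.0 + math.exp(-z))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        bias -= lr * grad_b / n
    return w, bias


def tv_estimate(before, after, seed=0):
    """Estimate TV distance between two samples from held-out accuracy."""
    rng = random.Random(seed)
    n = min(len(before), len(after))  # equal class sizes
    s0 = rng.sample(before, n)        # shuffled copy of the "before" sample
    s1 = rng.sample(after, n)         # shuffled copy of the "after" sample
    half = n // 2
    train = [(x, 0) for x in s0[:half]] + [(x, 1) for x in s1[:half]]
    test = [(x, 0) for x in s0[half:]] + [(x, 1) for x in s1[half:]]
    w, bias = train_logistic([x for x, _ in train], [y for _, y in train])
    correct = sum((w * x + bias > 0) == (y == 1) for x, y in test)
    acc = correct / len(test)
    return max(0.0, 2.0 * acc - 1.0)  # TV = 2 * accuracy - 1, clipped at 0


# Synthetic series (assumption): the mean shifts from 0 to 2 at index 300.
data_rng = random.Random(42)
series = [data_rng.gauss(0.0, 1.0) for _ in range(300)]
series += [data_rng.gauss(2.0, 1.0) for _ in range(300)]

window = 150  # number of points compared on each side of a candidate
candidates = range(window, len(series) - window + 1, 50)
scores = {
    t: tv_estimate(series[t - window:t], series[t:t + window])
    for t in candidates
}
best = max(scores, key=scores.get)  # candidate with the largest distance

for t in candidates:
    print(t, round(scores[t], 3))
print("largest estimated distance at index", best)
```

In this toy setting the estimated distance should peak near the true shift and stay close to zero where both windows come from the same distribution. With real articles, the scalar inputs would be replaced by text features and the logistic regression by a stronger classifier.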
