Retrieving Floods without Floodlights: Topic Models as Binary Classifiers for Extreme Climate Events in German News
Brielen Madureira, Mariana Madruga de Brito, Andreas Niekler

TL;DR
This paper explores using Topic Models as binary classifiers to improve retrieval of relevant news on extreme climate events in German media, offering an interpretable and unsupervised alternative to deep learning.
Contribution
It demonstrates how the posterior distributions estimated by Topic Models can be used for binary classification without modifying their training procedure, improving the precision of the retrieved sample of climate event news.
Findings
Topic Models improve retrieval precision for climate event news.
Keyword probabilities from Topic Models are informative for selecting relevant documents.
Results vary depending on the type of climate hazard, indicating the need for hazard-specific approaches.
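A minimal sketch of how keyword probabilities and posterior topic distributions could drive a binary relevance decision. This is an illustration under stated assumptions, not the authors' implementation: the function names (`relevant_topics`, `classify`), the two-step rule, the thresholds (`min_prob`, `threshold`), and the toy matrices are all hypothetical. It assumes a Topic Model has already been fitted and exposes a topic-word matrix and a document-topic matrix.

```python
# Hypothetical sketch: a fitted Topic Model used as a binary classifier.
# All names, thresholds, and numbers below are illustrative assumptions.

def relevant_topics(topic_word, vocab, keywords, min_prob):
    """Return indices of topics that assign at least min_prob to any query keyword."""
    keyword_ids = [vocab.index(k) for k in keywords if k in vocab]
    return [
        t for t, word_probs in enumerate(topic_word)
        if any(word_probs[i] >= min_prob for i in keyword_ids)
    ]

def classify(doc_topic, topics, threshold=0.5):
    """Label a document relevant when its posterior mass on the selected topics reaches threshold."""
    return [sum(theta[t] for t in topics) >= threshold for theta in doc_topic]

# Toy model: 2 topics over a 4-word vocabulary (each row sums to 1).
vocab = ["hochwasser", "regen", "fussball", "tor"]
topic_word = [
    [0.60, 0.30, 0.05, 0.05],  # topic 0: flood-related words dominate
    [0.02, 0.08, 0.50, 0.40],  # topic 1: football-related words dominate
]
# Posterior topic proportions for 3 documents (each row sums to 1).
doc_topic = [
    [0.90, 0.10],
    [0.20, 0.80],
    [0.55, 0.45],
]

topics = relevant_topics(topic_word, vocab, keywords=["hochwasser"], min_prob=0.1)
labels = classify(doc_topic, topics, threshold=0.5)
print(topics, labels)
```

In this toy setup only topic 0 gives the query keyword a high probability, so documents are labelled by how much posterior mass they place on that topic. Since results vary by hazard type, any thresholds of this kind would plausibly need to be chosen per hazard using an annotated sample.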
Abstract
In studies of media coverage of extreme climate events, NLP methods have become indispensable for identifying relevant texts in large news databases. Still, enough annotated data to train accurate deep learning-based classifiers from scratch is often not available. Topic Models have the advantage of being both unsupervised and interpretable, but are typically used only for exploratory analysis or data characterisation. In this study, we investigate how to employ Topic Models as binary classifiers for refining the retrieval of relevant news about seven types of extreme climate events in the German media. Our method relies on the posterior distributions estimated by Topic Models to select relevant documents, without modifying their training procedure. Using an annotated sample to guide the evaluation, we show that the probabilities assigned to keywords used to query news databases can…
