Evaluation of Thematic Coherence in Microblogs
Iman Munire Bilal, Bo Wang, Maria Liakata, Rob Procter, Adam Tsakalidis

TL;DR
This paper introduces a new dataset and evaluation framework for assessing thematic coherence in microblog clusters, comparing various automated metrics and finding text generation metrics most reliable.
Contribution
It provides annotated microblog clusters, defines the evaluation task, and systematically compares multiple metrics, highlighting the effectiveness of text generation metrics.
Findings
Surface-level metrics perform well, outperforming topic coherence metrics, but are not as consistent as text generation metrics.
Text generation metrics are more reliable across time windows.
Automated metrics vary in effectiveness for thematic coherence evaluation.
Abstract
Collecting together microblogs representing opinions about the same topics within the same timeframe is useful to a number of different tasks and practitioners. A major question is how to evaluate the quality of such thematic clusters. Here we create a corpus of microblog clusters from three different domains and time windows and define the task of evaluating thematic coherence. We provide annotation guidelines and human annotations of thematic coherence by journalist experts. We subsequently investigate the efficacy of different automated evaluation metrics for the task. We consider a range of metrics including surface level metrics, ones for topic model coherence and text generation metrics (TGMs). While surface level metrics perform well, outperforming topic coherence metrics, they are not as consistent as TGMs. TGMs are more reliable than all other metrics considered for capturing…
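To make the idea of a surface-level metric concrete, the following is a minimal sketch of one possible score: the average pairwise token overlap (Jaccard similarity) between posts in a cluster. This is an illustrative assumption, not the metric used in the paper; the function names `tokenize`, `jaccard`, and `cluster_overlap_score` are hypothetical, and the example posts are invented.

```python
from itertools import combinations
import re


def tokenize(post):
    """Lowercase a post and return its set of word tokens (hashtags and mentions kept)."""
    return set(re.findall(r"[#@]?\w+", post.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_overlap_score(posts):
    """Average pairwise Jaccard overlap across all posts in a cluster.

    Hypothetical surface-level score: higher values mean the posts share
    more vocabulary. Clusters with fewer than two posts score 0.0.
    """
    token_sets = [tokenize(p) for p in posts]
    pairs = list(combinations(token_sets, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Invented example clusters, for illustration only.
coherent = [
    "flood warning issued for york",
    "york flood warning remains in place",
    "flood defences in york hold",
]
mixed = [
    "flood warning issued for york",
    "new phone released today",
    "football final tonight",
]

print(cluster_overlap_score(coherent))
print(cluster_overlap_score(mixed))
```

A score like this rewards shared vocabulary only, so it can miss clusters that discuss the same topic in different words, which is one plausible reason a surface-level metric might be less consistent than a learned text generation metric.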
Taxonomy
Topics: Topic Modeling · Complex Network Analysis Techniques · Advanced Text Analysis Techniques
