Real-Time Summarization of Twitter
Yixin Jin, Meiqi Wang, Meng Li, Wenjing Zhou, Yi Shen, Hao Liu

TL;DR
This paper presents a real-time Twitter summarization system that classifies relevant tweets using Dirichlet scoring and aims to improve notification relevance and reduce redundancy.
Contribution
It introduces a Dirichlet score-based classification approach for real-time Twitter summarization and discusses algorithms for removing redundant tweets.
Findings
The approach achieves good performance on MAP, CG, and DCG metrics.
Dirichlet scoring effectively classifies relevant tweets.
A redundancy removal algorithm for the push queue is described to improve notification quality, but it is not evaluated.
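For readers unfamiliar with the metrics named above, the following is a minimal sketch of the textbook definitions of CG, DCG, and MAP. It is not the paper's evaluation code, and the official TREC evaluation scripts may define these measures differently. All function names are hypothetical, and the log2 rank discount is an assumption based on the common DCG formulation.

```python
import math


def cumulative_gain(gains):
    """Sum of the gain values of the returned tweets, ignoring rank."""
    return sum(gains)


def discounted_cumulative_gain(gains):
    """Gain discounted by log2(rank + 1), with ranks starting at 1 (assumed form)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def average_precision(relevance):
    """Mean of precision@k taken at each rank k that holds a relevant item."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs):
    """Average of average_precision over several interest profiles."""
    return sum(average_precision(r) for r in runs) / len(runs) if runs else 0.0


if __name__ == "__main__":
    ranked = [1, 0, 1]  # toy relevance judgments for one profile
    print(cumulative_gain(ranked))
    print(discounted_cumulative_gain(ranked))
    print(mean_average_precision([ranked, [0, 1]]))
```

Here each list holds binary relevance judgments in the order the tweets were returned for one interest profile.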
Abstract
In this paper, we describe our approaches to TREC Real-Time Summarization of Twitter. We focus on the real-time push notification scenario, which requires a system to monitor a stream of sampled tweets and return the tweets that are relevant and novel with respect to given interest profiles. Dirichlet scores with smoothing and with very little smoothing (baseline) are employed to classify whether a tweet is relevant to a given interest profile. Measured by Mean Average Precision (MAP), cumulative gain (CG), and discounted cumulative gain (DCG), the experiments indicate that our approach performs well. It is also desirable to remove redundant tweets from the push queue. Due to the precision limit, we only describe the algorithm in this paper.
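The abstract does not spell out the scoring formula, so the following is only a sketch of one standard reading: query likelihood with Dirichlet prior smoothing, where the interest profile plays the role of the query and each tweet is the document. The names `dirichlet_score`, `is_relevant`, `mu`, and `threshold` are hypothetical, as are the toy collection statistics. Skipping profile terms that never occur in the collection is an assumption made here to keep the logarithm defined.

```python
import math
from collections import Counter


def dirichlet_score(profile_terms, tweet_terms, collection_tf, collection_len, mu=100.0):
    """Log-likelihood of the profile terms under a Dirichlet-smoothed tweet model.

    P(w | tweet) = (tf(w, tweet) + mu * P(w | collection)) / (|tweet| + mu)
    A very small mu approximates the "very little smoothing" baseline.
    """
    tweet_tf = Counter(tweet_terms)
    tweet_len = len(tweet_terms)
    score = 0.0
    for term in profile_terms:
        p_coll = collection_tf.get(term, 0) / collection_len
        if p_coll == 0.0:
            continue  # assumption: ignore terms with no collection statistics
        score += math.log((tweet_tf[term] + mu * p_coll) / (tweet_len + mu))
    return score


def is_relevant(score, threshold):
    """Classify a tweet as relevant when its score reaches a tuned threshold."""
    return score >= threshold


if __name__ == "__main__":
    # Toy background statistics, invented for illustration only.
    collection_tf = {"earthquake": 5, "japan": 10, "coffee": 20, "morning": 15}
    collection_len = 1000
    profile = ["earthquake", "japan"]
    on_topic = ["earthquake", "hits", "japan", "coast"]
    off_topic = ["coffee", "this", "morning", "please"]
    for tweet in (on_topic, off_topic):
        print(tweet, dirichlet_score(profile, tweet, collection_tf, collection_len))
```

In a push-notification setting the threshold would have to be tuned per profile or globally; the sketch leaves that choice open.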
Taxonomy
Topics: Web Data Mining and Analysis · Advanced Text Analysis Techniques · Complex Network Analysis Techniques
