Early Discovery of Emerging Entities in Persian Twitter with Semantic Similarity
Shahin Yousefi, Mohsen Hooshmand, Mohsen Afsharchi

TL;DR
This paper introduces EEPT, an online clustering method for early detection of emerging entities in Persian Twitter. It requires no training data, and the paper evaluates it with a new metric; the results indicate that it finds significant entities before they are fully established.
Contribution
The paper presents EEPT, a novel training-free online clustering approach for early emerging entity detection in social media, along with a new metric for evaluation.
Findings
EEPT effectively discovers emerging entities before their establishment.
The new metric provides a better evaluation framework for early detection methods.
EEPT shows promising results in Persian Twitter data.
Abstract
Discovering emerging entities (EEs) is the problem of finding entities before they become established. These entities can be critical for individuals, companies, and governments. Many of them can be discovered on social media platforms, e.g., Twitter. They have been a focus of research in academia and industry in recent years. As with any machine learning problem, data availability is one of the major challenges. This paper proposes EEPT, an online clustering method that discovers EEs without any training on a dataset. Additionally, because a proper evaluation metric is lacking, this paper uses a new metric to evaluate the results. The results show that EEPT is promising and finds significant entities before their establishment.
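The abstract does not spell out how EEPT clusters tweets, so the following is only a generic sketch of training-free, single-pass online clustering, not the authors' algorithm. It assumes a bag-of-words vector as a stand-in for whatever semantic representation EEPT uses, and the names `OnlineClusterer`, `cosine`, and the `threshold` value are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(weight * b[token] for token, weight in a.items() if token in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class OnlineClusterer:
    """Single-pass (leader-style) clustering over a stream of short texts.

    Illustrative only: bag-of-words vectors replace a real semantic
    embedding, and the threshold is an assumed value, not one from the paper.
    """

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # hypothetical similarity cut-off
        self.centroids = []         # summed bag-of-words Counter per cluster
        self.sizes = []             # number of texts assigned to each cluster

    def add(self, text):
        """Assign text to its most similar cluster, or open a new one.

        Returns the index of the cluster the text ended up in.
        """
        vec = Counter(text.lower().split())
        best, best_sim = -1, 0.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= self.threshold:
            # Fold the new text into the existing cluster summary.
            self.centroids[best].update(vec)
            self.sizes[best] += 1
            return best
        # No cluster is similar enough: start a new candidate cluster.
        self.centroids.append(vec)
        self.sizes.append(1)
        return len(self.centroids) - 1
```

In a sketch like this, each incoming tweet is seen once and no labelled data is needed; a cluster that starts small and grows quickly could then be treated as a candidate emerging entity, though how EEPT actually scores candidates is not stated in the abstract.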
Taxonomy
Topics: Complex Network Analysis Techniques · Advanced Text Analysis Techniques · Spam and Phishing Detection
