Using machine learning and information visualisation for discovering latent topics in Twitter news
Vladimir Vargas-Calderón, Marlon Steibeck Dominguez, N. Parra-A., Herbert Vinck-Posada, Jorge E. Camargo

TL;DR
This paper presents a combined machine learning and visualization approach to identify and interpret latent topics in Twitter news data, demonstrated through Colombian media tweets from 2014 to 2019.
Contribution
It introduces a method for topic discovery and visualisation in large tweet collections that finds latent topics in two ways: with an LDA model, and with FastText tweet vectors grouped by K-means clustering.
Findings
People respond differently to various news topics.
The Colombian peace treaty topic elicited significant public engagement.
Supervised classification can predict news topics from replies.
Abstract
We propose a method to discover latent topics and visualise large collections of tweets for easy identification and interpretation of topics, and exemplify its use with tweets from a Colombian mass media giant in the period 2014–2019. The latent topic analysis is performed in two ways: with the training of a Latent Dirichlet Allocation model, and with the combination of the FastText unsupervised model to represent tweets as vectors and the implementation of K-means clustering to group tweets into topics. Using a classification task, we found that people respond differently according to the various news topics. The classification task consists of the following: given a reply to a news tweet, we train a supervised algorithm to predict the topic of the news tweet solely from the reply. Furthermore, we show how the Colombian peace treaty has had a profound impact on the Colombian society,…
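As a rough illustration of the second route in the abstract (tweets represented as vectors, then grouped into topics by K-means), the following is a minimal sketch and not the authors' implementation. The function name `kmeans`, the variable `tweet_vectors`, and the toy 2-D points are all hypothetical; in the paper's setting the vectors would come from a FastText model rather than being written by hand.

```python
import math
import random


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids with k distinct input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels


# Hypothetical 2-D stand-ins for tweet embeddings (assumption: real
# FastText vectors would have many more dimensions).
tweet_vectors = [
    (0.1, 0.2), (0.0, 0.1), (0.2, 0.0),  # one group of similar tweets
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),  # a second, distant group
]
centroids, labels = kmeans(tweet_vectors, k=2)
print(labels)
```

Each cluster label then plays the role of a latent topic: tweets sharing a label are treated as belonging to the same topic.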
Taxonomy
Topics: Social Media and Politics · Complex Network Analysis Techniques · Communication and COVID-19 Impact
Methods: k-Means Clustering · fastText
