Bandit algorithms for real-time data capture on large social medias
Thibault Gisselbrecht

TL;DR
This paper develops and compares several bandit algorithms for real-time social media data collection, focusing on selecting relevant user accounts under resource constraints to maximize information quality.
Contribution
It introduces multiple bandit models, including contextual and non-stationary variants, for dynamic user selection in social media data capture.
Findings
The proposed bandit models outperform baseline methods on artificial datasets.
Contextual bandits improve relevance of captured data.
Latent space models capture complex user interactions.
Abstract
We study the problem of real time data capture on social media. Due to the different limitations imposed by those media, but also to the very large amount of information, it is impossible to collect all the data produced by social networks such as Twitter. Therefore, to be able to gather enough relevant information related to a predefined need, it is necessary to focus on a subset of the information sources. In this work, we focus on user-centered data capture and consider each account of a social network as a source that can be listened to at each iteration of a data capture process, in order to collect the corresponding produced contents. This process, whose aim is to maximize the quality of the information gathered, is constrained by the number of users that can be monitored simultaneously. The problem of selecting a subset of accounts to listen to over time is a sequential decision…
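The selection problem described in the abstract (choosing which accounts to listen to at each iteration, under a cap on how many can be monitored at once) can be illustrated with a generic multiple-play UCB loop. This is a minimal sketch under stated assumptions, not the thesis's own algorithms: the Bernoulli reward model, the UCB1-style exploration bonus, and the names `select_accounts`, `run_capture`, `reward_probs`, and `k` are all hypothetical and chosen only for illustration.

```python
import math
import random


def select_accounts(counts, means, t, k):
    """Return the indices of the k accounts with the highest UCB scores."""
    def score(i):
        if counts[i] == 0:
            return float("inf")  # listen to every account at least once
        # Empirical mean reward plus a UCB1-style exploration bonus (assumed form).
        return means[i] + math.sqrt(2.0 * math.log(t) / counts[i])

    return sorted(range(len(counts)), key=score, reverse=True)[:k]


def run_capture(reward_probs, k, steps, seed=0):
    """Simulate a capture process with Bernoulli rewards (hypothetical setup)."""
    rng = random.Random(seed)
    n = len(reward_probs)
    counts = [0] * n    # number of times each account was listened to
    means = [0.0] * n   # running average reward per account
    for t in range(1, steps + 1):
        for i in select_accounts(counts, means, t, k):
            # Reward 1.0 stands for "relevant content was collected" (assumption).
            reward = 1.0 if rng.random() < reward_probs[i] else 0.0
            counts[i] += 1
            means[i] += (reward - means[i]) / counts[i]
    return counts, means


if __name__ == "__main__":
    counts, means = run_capture([0.05, 0.1, 0.9, 0.8, 0.1, 0.05], k=2, steps=500)
    print(counts)
    print([round(m, 2) for m in means])
```

Here `k` plays the role of the constraint on the number of simultaneously monitored users, and each call to `select_accounts` corresponds to one iteration of the capture process. In this toy setup the listening counts tend to concentrate on the accounts with the highest reward probabilities as the number of steps grows.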
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Mobile Crowdsensing and Crowdsourcing
