Simulation, Modelling and Classification of Wiki Contributors: Spotting The Good, The Bad, and The Ugly
Silvia García Méndez, Fátima Leal, Benedita Malheiro, Juan Carlos Burguillo Rial, Bruno Veloso, Adriana E. Chis, Horacio González Vélez

TL;DR
This paper introduces a simulation and classification framework to identify malicious and benign contributors in crowdsourced data platforms like wikis, using data balancing and stream modelling to improve accuracy.
Contribution
It presents a novel approach combining data fabrication, stream modelling, and autonomous classification to enhance detection of malicious contributors in open collaborative environments.
Findings
Achieved up to 92% classification accuracy.
Significantly improved classifier confidence with class-balanced data.
Effective differentiation between human and bot contributors.
Abstract
Data crowdsourcing is a data acquisition process where groups of voluntary contributors feed platforms with highly relevant data ranging from news, comments, and media to knowledge and classifications. It typically processes user-generated data streams to provide and refine popular services such as wikis, collaborative maps, e-commerce sites, and social networks. Nevertheless, this modus operandi raises severe concerns regarding ill-intentioned data manipulation in adversarial environments. This paper presents a simulation, modelling, and classification approach to automatically identify human and non-human (bots) as well as benign and malign contributors by using data fabrication to balance classes within experimental data sets, data stream modelling to build and update contributor profiles and, finally, autonomic data stream classification. By employing WikiVoyage - a free worldwide…
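Two of the steps named in the abstract, updating contributor profiles from a data stream and balancing classes, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the event fields (`user`, `size`, `reverted`), the `Profile` features, and both function names are hypothetical, and plain random oversampling stands in here for the paper's data fabrication, whose details the summary does not give. The classification step is omitted.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Profile:
    """Running per-contributor statistics (hypothetical features)."""
    n_edits: int = 0
    mean_size: float = 0.0   # running mean of edit size
    n_reverted: int = 0      # how many of the edits were reverted

    def update(self, size, reverted):
        # Incremental update: no past events need to be stored.
        self.n_edits += 1
        self.mean_size += (size - self.mean_size) / self.n_edits
        self.n_reverted += int(reverted)

    def features(self):
        # Feature tuple that a downstream classifier could consume.
        return (self.n_edits, self.mean_size, self.n_reverted / self.n_edits)


def build_profiles(stream):
    """Consume (user, size, reverted) events one at a time."""
    profiles = defaultdict(Profile)
    for user, size, reverted in stream:
        profiles[user].update(size, reverted)
    return dict(profiles)


def oversample(samples, labels, seed=0):
    """Random oversampling: duplicate minority-class items until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_x.extend(xs + extra)
        out_y.extend([y] * target)
    return out_x, out_y
```

Because each profile is updated in constant time per event, the features stay current as new edits arrive, which is the property a stream-based classifier relies on. Oversampling only duplicates existing samples; a fabrication approach that generates new synthetic contributors would replace `oversample` in this sketch.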
