A clustering approach to infer Wikipedia contributors' profile
Shubham Krishna, Romain Billot, Nicolas Jullien

TL;DR
This paper presents a clustering method to identify Wikipedia contributors' profiles using only their edits and activity patterns, applicable across languages, enabling early detection and improved community management.
Contribution
It introduces a simple, language-independent clustering approach based solely on edit data, facilitating early profile detection without complex manual coding.
Findings
Profiles are identifiable early in contributor history
Method is validated on Romanian and Danish Wikipedias
Profiles are stable and accurate across languages
Abstract
In online communities, recent studies have greatly improved our knowledge of the different types, or profiles, of contributors, from casual contributors to very involved ones, by way of focused ones. However, they do so with very complex methodologies (a qualitative-quantitative mix, with a high workload to manually codify and characterize the edits), which limits their replication by practitioners. These studies also cover the English Wikipedia only. The objective of this paper is to highlight different contributor profiles using clustering techniques. Its originality is to show how using only the edits, and their distribution over time, makes it possible to build these contributor profiles with good accuracy and stability across languages. The methodology is validated on both the Romanian and Danish wikis. The highlighted profiles are identifiable early in the history of involvement, suggesting that…
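To make the idea of clustering on edits and their distribution over time more concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the abstract does not specify the features or the clustering algorithm, so the 90-day observation window, the three time bins, the log scaling, the use of plain k-means, and the toy contributors are all assumptions chosen for illustration.

```python
from math import dist, log1p


def edit_features(edit_days, window_days=90, n_bins=3):
    """Turn one contributor's edit days (days since their first edit) into
    a feature vector of log-scaled edit counts per time bin.

    Assumption: only the first `window_days` days are used, to mimic
    profiling early in a contributor's history.
    """
    bin_width = window_days / n_bins
    counts = [0] * n_bins
    for day in edit_days:
        if 0 <= day < window_days:
            counts[min(int(day // bin_width), n_bins - 1)] += 1
    return [log1p(c) for c in counts]


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Seed the next center with the point farthest from existing centers.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical toy data: edit days per contributor, not taken from the paper.
contributors = {
    "casual_a": [0],
    "casual_b": [0, 2],
    "casual_c": [0, 40],
    "sustained_a": list(range(0, 90, 2)),
    "sustained_b": list(range(0, 90, 3)),
}

features = [edit_features(days) for days in contributors.values()]
labels, centers = kmeans(features, k=2)

for name, label in zip(contributors, labels):
    print(name, "-> cluster", label)
```

On this toy data the three occasional editors share one cluster and the two steady editors share the other. Because the features depend only on edit timestamps, nothing in the sketch is tied to a particular language edition, which is the property the abstract emphasizes.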
