Automatic Quality Assessment of Wikipedia Articles -- A Systematic Literature Review

Pedro Miguel Moás; Carla Teixeira Lopes

arXiv:2310.02235·cs.CL·October 4, 2023

TL;DR

This paper systematically reviews 149 studies on automatic quality assessment of Wikipedia articles, comparing their machine learning algorithms, article features, quality metrics, and datasets. It notes that Wikipedia itself still makes little use of machine learning for this task.

Contribution

The review compares existing methods, identifies commonalities and gaps among them, and aims to help future researchers increase Wikipedia's adoption of machine learning for quality assessment.

Findings

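01. Machine learning is still not widely used by Wikipedia itself, even though the literature on automatic quality assessment is extensive.

02. The reviewed methods differ in the machine learning algorithms, article features, quality metrics, and datasets they use; the review identifies and compares all four.

03. The approaches in the literature follow past technological trends.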

Abstract

Wikipedia is the world's largest online encyclopedia, but maintaining article quality through collaboration is challenging. Wikipedia designed a quality scale, but with such a manual assessment process, many articles remain unassessed. We review existing methods for automatically measuring the quality of Wikipedia articles, identifying and comparing machine learning algorithms, article features, quality metrics, and used datasets, examining 149 distinct studies, and exploring commonalities and gaps in them. The literature is extensive, and the approaches follow past technological trends. However, machine learning is still not widely used by Wikipedia, and we hope that our analysis helps future researchers change that reality.
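To make the idea of feature-based quality assessment concrete, here is a minimal sketch in Python. It is not a method taken from the review: the four structural features, the regular expressions, the nearest-centroid classifier, and the toy training texts are all assumptions chosen for illustration. The labels "Stub" and "FA" (Featured Article) are two classes from Wikipedia's content assessment scale.

```python
import re
from collections import defaultdict
from math import dist


def extract_features(wikitext: str) -> list[float]:
    """Compute four illustrative structural features from raw wikitext.

    These features (word, reference, heading, and link counts) are
    assumptions for this sketch, not a feature set from the review.
    """
    words = len(wikitext.split())
    refs = len(re.findall(r"<ref[ >]", wikitext))  # opening <ref> tags only
    headings = len(re.findall(r"^==+.+?==+\s*$", wikitext, flags=re.MULTILINE))
    links = len(re.findall(r"\[\[", wikitext))  # internal wiki links
    return [float(words), float(refs), float(headings), float(links)]


def train_centroids(samples: list[tuple[str, str]]) -> dict[str, list[float]]:
    """Average the feature vectors of each quality class."""
    grouped: dict[str, list[list[float]]] = defaultdict(list)
    for text, label in samples:
        grouped[label].append(extract_features(text))
    return {
        label: [sum(col) / len(col) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids: dict[str, list[float]], wikitext: str) -> str:
    """Return the class whose centroid is closest in Euclidean distance.

    Features are left unscaled here, so word count dominates the distance.
    """
    feats = extract_features(wikitext)
    return min(centroids, key=lambda label: dist(centroids[label], feats))


# Toy, made-up training texts; real studies train on labeled Wikipedia articles.
STUB_TEXT = "A town in Portugal."
FA_TEXT = (
    "== History ==\n"
    "The town was founded [[long ago]].<ref>Source A</ref>\n"
    "== Geography ==\n"
    'It lies near the [[river]].<ref name="b">Source B</ref>\n'
)

centroids = train_centroids([(STUB_TEXT, "Stub"), (FA_TEXT, "FA")])
```

Calling `predict(centroids, some_wikitext)` returns the nearest class label. A realistic system would use far more training data, scaled features, and one of the learning algorithms the review compares.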
