Seeing Like an AI: How LLMs Apply (and Misapply) Wikipedia Neutrality Norms

Joshua Ashkinaze; Ruijia Guan; Laura Kurek; Eytan Adar; Ceren Budak; Eric Gilbert

arXiv:2407.04183·cs.CL·May 11, 2026·1 cites

TL;DR

This paper evaluates how well LLMs detect and correct biased Wikipedia edits under Wikipedia's Neutral Point of View (NPOV) policy. The models remove most of the words that editors remove, but they struggle to detect bias and make many changes the editors did not.

Contribution

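The paper analyzes how LLMs apply Wikipedia's neutrality policy in both detection and rewriting. It shows that models hold contrasting priors about neutrality (some under-predict bias, others over-predict it) and that they edit more heavily than Wikipedia editors do.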

Findings

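01 LLMs achieved only 64% accuracy in bias detection on a balanced dataset.

02 Models removed 79% of the words that Wikipedia editors removed, but made additional changes beyond the editors' simpler neutralizations (high recall, low precision).

03 Crowdworkers rated AI rewrites as more neutral (70%) and more fluent (61%).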

Abstract

Large language models (LLMs) are trained on broad corpora and then used in communities with specialized norms. Is providing LLMs with community rules enough for models to follow these norms? We evaluate LLMs' capacity to detect (Task 1) and correct (Task 2) biased Wikipedia edits according to Wikipedia's Neutral Point of View (NPOV) policy. LLMs struggled with bias detection, achieving only 64% accuracy on a balanced dataset. Models exhibited contrasting biases (some under- and others over-predicted bias), suggesting distinct priors about neutrality. LLMs performed better at generation, removing 79% of words removed by Wikipedia editors. However, LLMs made additional changes beyond Wikipedia editors' simpler neutralizations, resulting in high-recall but low-precision editing. Interestingly, crowdworkers rated AI rewrites as more neutral (70%) and fluent (61%) than Wikipedia-editor…
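The "high-recall but low-precision" characterization can be made concrete with a small word-level calculation. The sketch below is illustrative only: the abstract does not specify how the paper tokenizes text or matches removed words, so the whitespace tokenization, the function names `removed_words` and `removal_precision_recall`, and the example sentences are all assumptions rather than the authors' method.

```python
from collections import Counter


def removed_words(before: str, after: str) -> Counter:
    """Multiset of words that appear in `before` but not in `after`.

    Assumption: lowercase whitespace tokenization; the paper may differ.
    """
    return Counter(before.lower().split()) - Counter(after.lower().split())


def removal_precision_recall(original: str, editor_rewrite: str, model_rewrite: str):
    """Compare the words a model removed against the words an editor removed.

    recall    = share of editor-removed words that the model also removed
    precision = share of model-removed words that the editor also removed
    """
    editor_removed = removed_words(original, editor_rewrite)
    model_removed = removed_words(original, model_rewrite)
    overlap = sum((editor_removed & model_removed).values())
    n_editor = sum(editor_removed.values())
    n_model = sum(model_removed.values())
    recall = overlap / n_editor if n_editor else 0.0
    precision = overlap / n_model if n_model else 0.0
    return precision, recall


# Hypothetical sentences, invented for illustration (not from the paper's data).
original = "the brilliant senator gave a truly remarkable speech on monday"
editor_rewrite = "the senator gave a speech on monday"
model_rewrite = "the senator spoke"

precision, recall = removal_precision_recall(original, editor_rewrite, model_rewrite)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case the model removes every word the editor removed (recall 1.0) but also removes five words the editor kept (precision 0.375), which is the pattern the abstract describes: the model covers the editor's neutralizations and then keeps editing.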
