Anthropocentric bias in language model evaluation
Raphaël Millière, Charles Rathkopf

TL;DR
This paper discusses overlooked anthropocentric biases in evaluating large language models, emphasizing the need for empirical, mechanistic approaches to accurately assess their true cognitive capacities.
Contribution
It identifies two neglected biases—auxiliary oversight and mechanistic chauvinism—and proposes an iterative, empirical methodology to mitigate these biases in LLM evaluation.
Findings
Identification of two key anthropocentric biases in LLM evaluation
Proposal of an empirical, mechanistic approach for better assessment
Emphasis on supplementing behavioral experiments with mechanistic studies
Abstract
Evaluating the cognitive capacities of large language models (LLMs) requires overcoming not only anthropomorphic but also anthropocentric biases. This article identifies two types of anthropocentric bias that have been neglected: overlooking how auxiliary factors can impede LLM performance despite competence ("auxiliary oversight"), and dismissing LLM mechanistic strategies that differ from those of humans as not genuinely competent ("mechanistic chauvinism"). Mitigating these biases necessitates an empirically driven, iterative approach to mapping cognitive tasks to LLM-specific capacities and mechanisms, which can be done by supplementing carefully designed behavioral experiments with mechanistic studies.
