The Subjectivity of Monoculture
Nathanael Jo, Nikhil Garg, Manish Raghavan

TL;DR
Whether models agree "too much" (monoculture) is a subjective judgment: the conclusion depends on the null model chosen as the baseline for independence and on the set of models and questions under evaluation.
Contribution
The paper shows that assessments of excess model agreement are subjective and context-dependent: both the choice of null model and the evaluation setting (which items, which peer models) shape the conclusions of a monoculture analysis.
Findings
Different null models lead to vastly different inferences about model agreement.
Model correlations vary significantly across different question sets and peer groups.
Experiments on two large-scale benchmarks support the theoretical claims about subjectivity in monoculture evaluation.
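To make the first finding concrete, here is a minimal Python sketch of how one observed agreement rate can look excessive under one null model and unremarkable under another. It is an illustration under stated assumptions, not the paper's method: the two null models, the function names, and the simulated benchmark (half easy items, half hard items, with known per-item success probabilities) are all hypothetical choices made for this example.

```python
import random


def agreement_rate(a, b):
    """Fraction of items on which two 0/1 correctness vectors match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def null_marginal(a, b):
    """Expected agreement if the models are independent given only
    their overall accuracies (item difficulty is ignored)."""
    acc_a = sum(a) / len(a)
    acc_b = sum(b) / len(b)
    return acc_a * acc_b + (1 - acc_a) * (1 - acc_b)


def null_item_difficulty(p_correct):
    """Expected agreement if the models are independent *given* each
    item's probability of being answered correctly."""
    return sum(p * p + (1 - p) * (1 - p) for p in p_correct) / len(p_correct)


def simulate(n_items=2000, seed=0):
    """Hypothetical benchmark: even-indexed items are easy (p = 0.9),
    odd-indexed items are hard (p = 0.1). Both models answer each item
    independently with that item's success probability."""
    rng = random.Random(seed)
    p_correct = [0.9 if i % 2 == 0 else 0.1 for i in range(n_items)]
    a = [int(rng.random() < p) for p in p_correct]
    b = [int(rng.random() < p) for p in p_correct]
    return p_correct, a, b


p_correct, a, b = simulate()
observed = agreement_rate(a, b)
excess_marginal = observed - null_marginal(a, b)
excess_item = observed - null_item_difficulty(p_correct)

print(f"observed agreement:           {observed:.3f}")
print(f"excess vs. marginal null:     {excess_marginal:+.3f}")
print(f"excess vs. item-difficulty null: {excess_item:+.3f}")
```

By construction the two simulated models share nothing except the items themselves, yet the marginal-accuracy baseline reports a large excess while the item-difficulty baseline reports roughly none. In practice item difficulties are not known and would have to be estimated, for example from a population of peer models, which is one place where the choice of peers enters the analysis.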
Abstract
Machine learning models -- including large language models (LLMs) -- are often said to exhibit monoculture, where outputs agree strikingly often. But what does it actually mean for models to agree too much? We argue that this question is inherently subjective, relying on two key decisions. First, the analyst must specify a baseline null model for what "independence" should look like. This choice is inherently subjective, and as we show, different null models result in dramatically different inferences about excess agreement. Second, we show that inferences depend on the population of models and items under consideration. Models that seem highly correlated in one context may appear independent when evaluated on a different set of questions, or against a different set of peers. Experiments on two large-scale benchmarks validate our theoretical findings. For example, we find drastically…
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Mobile Crowdsensing and Crowdsourcing
