Epistemic diversity across language models mitigates knowledge collapse
Damian Hodel, Jevin D. West

TL;DR
Increasing diversity among language models in an AI ecosystem mitigates knowledge collapse by sustaining performance across self-training iterations, especially as models and data scale up, which highlights the importance of ecosystem heterogeneity.
Contribution
This work demonstrates that ecosystem diversity among language models mitigates knowledge collapse, providing empirical evidence across various settings and suggesting strategies for sustainable AI knowledge production.
Findings
Single-model training yields short-term gains but long-term collapse
Optimal diversity increases with training iterations
Scaling up amplifies collapse in homogeneous ecosystems
Abstract
As artificial intelligence (AI) becomes more widely used, concerns are growing that model collapse could lead to knowledge collapse, i.e. a degradation to a narrow and inaccurate set of ideas. Prior work has demonstrated single-model collapse, defined as performance decay in an AI model trained on its own outputs. Inspired by ecology, we ask whether increasing AI ecosystem diversity (i.e., the number of distinct models) can mitigate such collapse. To study the effect of diversity on model performance, we extend the single-model approach by segmenting the training data across an increasing number of language models and evaluating the resulting ecosystems of models over ten self-training iterations. We find that training a single model on the entire dataset improves performance only in the short term but amplifies collapse over longer horizons. Specifically, we observe that the optimal…
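The experimental design described in the abstract (segment the training data across several models, then let the ecosystem train on its own outputs for ten iterations) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a unigram frequency table as a stand-in for a language model, assumes that generated outputs are pooled and re-segmented at every iteration, and uses the number of surviving distinct tokens as a crude proxy for diversity of knowledge. All function names here (`split_evenly`, `fit_unigram`, `sample`, `run_ecosystem`) are hypothetical.

```python
import random
from collections import Counter


def split_evenly(data, n_models):
    """Partition data into n_models disjoint, near-equal segments."""
    return [data[i::n_models] for i in range(n_models)]


def fit_unigram(tokens):
    """Toy stand-in for a language model: empirical token frequencies."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


def sample(model, n, rng):
    """Draw n tokens from a unigram model."""
    toks = list(model)
    weights = [model[t] for t in toks]
    return rng.choices(toks, weights=weights, k=n)


def run_ecosystem(data, n_models, n_iterations=10, seed=0):
    """Self-train an ecosystem of n_models toy models.

    Returns the number of distinct tokens left in the pooled corpus
    after each iteration (an assumed proxy metric, not the paper's).
    """
    if n_models < 1 or n_models > len(data):
        raise ValueError("n_models must be between 1 and len(data)")
    rng = random.Random(seed)
    corpus = list(data)
    history = []
    for _ in range(n_iterations):
        segments = split_evenly(corpus, n_models)
        models = [fit_unigram(seg) for seg in segments]
        # Each model generates as many tokens as it was trained on;
        # the outputs are pooled to form the next iteration's corpus.
        corpus = [
            tok
            for model, seg in zip(models, segments)
            for tok in sample(model, len(seg), rng)
        ]
        rng.shuffle(corpus)  # assumption: re-segment at random each round
        history.append(len(set(corpus)))
    return history


if __name__ == "__main__":
    data = [f"w{i % 50}" for i in range(1000)]
    for m in (1, 4, 16):
        print(m, run_ecosystem(data, m))
```

Because every iteration resamples only tokens that are already in the corpus, the distinct-token count in this toy can never increase. Whether it reproduces any of the paper's reported trends is not something this sketch can establish; it only makes the segment, train, generate, and repeat loop concrete.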
Taxonomy
Topics: Language and cultural evolution · Computational and Text Analysis Methods · Domain Adaptation and Few-Shot Learning
