Multi-LLM Systems Exhibit Robust Semantic Collapse
Weiyi Kong, Shiyang Lai, Jinghua Piao, James Evans

TL;DR
Multi-LLM systems operating in closed loops tend to converge semantically, which limits their ability to produce diverse, novel content. A range of intervention strategies fails to prevent this, and the convergence is attributed to intrinsic autoregressive properties.
Contribution
This study empirically demonstrates the phenomenon of semantic collapse in multi-LLM systems and analyzes its underlying mechanisms, highlighting fundamental constraints.
Findings
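Semantic collapse appears consistently across model families in extended simulations of 200 to 1,000 rounds.
Twelve intervention strategies, spanning decoding parameters, prompt design, agent composition, activation engineering, and reinforcement learning, fail to restore semantic diversity.
Mechanistic analyses point to intrinsic autoregressive properties as the source of the collapse.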
Abstract
Whether machines can originate novel content has been debated for nearly two centuries, from Lovelace's assertion that no engine can "originate anything" to Turing's question of whether a machine can amplify ideas brought in from outside. Multi-large language model (LLM) systems, increasingly deployed for autonomous generation, reopen this question empirically. Here we show that such systems, operating in closed loops, exhibit semantic collapse: systematic convergence in semantic representations despite apparent lexical variation. Across model families and extended simulations of 200 to 1,000 rounds, the pattern remains consistent. Twelve intervention strategies, spanning decoding parameters, prompt design, agent composition, activation engineering, and reinforcement learning, fail to restore semantic diversity. Mechanistic analyses suggest that semantic collapse is not explained by…
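The distinction the abstract draws between semantic convergence and lexical variation can be illustrated with a small sketch. This is not the authors' metric: it assumes semantic diversity is measured as mean pairwise cosine distance between embedding vectors and lexical diversity as mean pairwise Jaccard distance between token sets. The embeddings and texts below are hand-made toy values, not outputs of any model or data from the paper.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def jaccard_distance(text_a, text_b):
    """1 - |A ∩ B| / |A ∪ B| over lowercase whitespace tokens."""
    a = set(text_a.lower().split())
    b = set(text_b.lower().split())
    return 1.0 - len(a & b) / len(a | b)


def mean_pairwise(items, distance):
    """Average a distance function over all unordered pairs of items."""
    pairs = list(combinations(items, 2))
    return sum(distance(x, y) for x, y in pairs) / len(pairs)


# Hypothetical toy embeddings (assumption): early outputs point in
# different directions, late outputs cluster around one direction.
early_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
late_embeddings = [[1.0, 0.1, 0.0], [1.0, 0.0, 0.1], [1.0, 0.05, 0.05]]

# Hypothetical late-round texts (assumption): different words, same idea.
late_texts = [
    "oceans warm as carbon rises",
    "rising emissions heat the seas",
    "marine temperatures climb with co2",
]

early_semantic = mean_pairwise(early_embeddings, cosine_distance)
late_semantic = mean_pairwise(late_embeddings, cosine_distance)
late_lexical = mean_pairwise(late_texts, jaccard_distance)

print(f"early semantic diversity: {early_semantic:.3f}")
print(f"late semantic diversity:  {late_semantic:.3f}")
print(f"late lexical diversity:   {late_lexical:.3f}")
```

In this toy setup the late-round texts share no tokens, so their lexical diversity stays high, while their embeddings nearly coincide, so their semantic diversity is close to zero. That combination is the signature the abstract calls semantic collapse; a real analysis would substitute embeddings from an actual encoder and track the metric round by round.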
