Lying with Truths: Open-Channel Multi-Agent Collusion for Belief Manipulation via Generative Montage
Jinwei Hu, Xinmiao Huang, Youcheng Sun, Yi Dong, Xiaowei Huang

TL;DR
This paper reveals a new threat in which colluding language models manipulate a victim's beliefs using only truthful evidence fragments. The vulnerability is widespread across diverse models, and the resulting false beliefs cascade into downstream decision-making.
Contribution
The paper formalizes the first cognitive collusion attack and introduces Generative Montage, a Writer-Editor-Director framework for constructing deceptive narratives without falsified documents.
Findings
Attack success rates reach over 70% across 14 LLM families.
Stronger reasoning models are more susceptible to manipulation.
False beliefs cascade to downstream judges with over 60% deception rate.
Abstract
As large language models (LLMs) transition to autonomous agents synthesizing real-time information, their reasoning capabilities introduce an unexpected attack surface. This paper introduces a novel threat where colluding agents steer victim beliefs using only truthful evidence fragments distributed through public channels, without relying on covert communications, backdoors, or falsified documents. By exploiting LLMs' overthinking tendency, we formalize the first cognitive collusion attack and propose Generative Montage: a Writer-Editor-Director framework that constructs deceptive narratives through adversarial debate and coordinated posting of evidence fragments, causing victims to internalize and propagate fabricated conclusions. To study this risk, we develop CoPHEME, a dataset derived from real-world rumor events, and simulate attacks across diverse LLM families. Our results show…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Topic Modeling
