When AI Agents Collude Online: Financial Fraud Risks by Collaborative LLM Agents on Social Platforms
Qibing Ren, Zhijie Zheng, Jiaxuan Guo, Junchi Yan, Lizhuang Ma, Jing Shao

TL;DR
This paper examines the risks of collaborative financial fraud by large language model agents on social platforms, introduces a comprehensive benchmark, and proposes mitigation strategies.
Contribution
It presents MultiAgentFraudBench, a large-scale benchmark for simulating online fraud scenarios involving LLM agents, and analyzes factors influencing fraud success and mitigation methods.
Findings
Malicious agents can adapt to mitigation strategies.
The benchmark covers 28 fraud scenarios across the fraud lifecycle.
Content warnings and LLM monitors can reduce fraud success.
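The two mitigations named in the findings, content warnings and monitors, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's implementation: the `Post` type, the warning text, the cue phrases, and `keyword_monitor`, which stands in for an LLM monitor with a simple phrase check so the example runs without a model.

```python
from dataclasses import dataclass
from typing import Callable, List, Set

# Hypothetical warning text; the paper's actual wording is not given here.
WARNING = "[Warning: this content may be part of a financial scam.]"


@dataclass
class Post:
    author: str
    text: str


def keyword_monitor(text: str) -> bool:
    """Hypothetical stand-in for an LLM monitor: flags scam-like phrases."""
    cues = ("guaranteed return", "wire the money", "send your password")
    lowered = text.lower()
    return any(cue in lowered for cue in cues)


def add_warnings(posts: List[Post], is_suspicious: Callable[[str], bool]) -> List[Post]:
    """Content-level mitigation: prepend a warning to each flagged post."""
    return [
        Post(p.author, f"{WARNING} {p.text}") if is_suspicious(p.text) else p
        for p in posts
    ]


def authors_to_block(posts: List[Post], is_suspicious: Callable[[str], bool]) -> Set[str]:
    """Agent-level mitigation: collect authors of flagged posts for blocking."""
    return {p.author for p in posts if is_suspicious(p.text)}


posts = [
    Post("agent_a", "Guaranteed return of 50% weekly, message me!"),
    Post("agent_b", "Lunch was great today."),
]
warned = add_warnings(posts, keyword_monitor)
blocked = authors_to_block(posts, keyword_monitor)
```

In a real deployment the `is_suspicious` callable would wrap a model call; passing it in as a parameter keeps the warning and blocking logic independent of how detection is done.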
Abstract
In this work, we study the risks of collective financial fraud in large-scale multi-agent systems powered by large language model (LLM) agents. We investigate whether agents can collaborate in fraudulent behaviors, how such collaboration amplifies risks, and what factors influence fraud success. To support this research, we present MultiAgentFraudBench, a large-scale benchmark for simulating financial fraud scenarios based on realistic online interactions. The benchmark covers 28 typical online fraud scenarios, spanning the full fraud lifecycle across both public and private domains. We further analyze key factors affecting fraud success, including interaction depth, activity level, and fine-grained collaboration failure modes. Finally, we propose a series of mitigation strategies, including adding content-level warnings to fraudulent posts and dialogues, using LLMs as monitors to block…
