Enhancing Robustness of LLM-Driven Multi-Agent Systems through Randomized Smoothing
Jinwei Hu, Yi Dong, Zhengtao Ding, Xiaowei Huang

TL;DR
This paper introduces a randomized smoothing framework to improve the robustness and safety of large language model-driven multi-agent systems in critical applications, providing probabilistic guarantees against adversarial attacks.
Contribution
The paper applies randomized smoothing to the consensus process of LLM-driven multi-agent systems, offering a scalable, black-box robustness certification method for safety-critical domains.
Findings
Prevents adversarial behaviors and hallucinations from propagating between agents in simulation.
Maintains consensus performance under adversarial influence.
Balances robustness and computational cost through a two-stage adaptive sampling mechanism.
Abstract
This paper presents a defense framework for enhancing the safety of large language model (LLM) empowered multi-agent systems (MAS) in safety-critical domains such as aerospace. We apply randomized smoothing, a statistical robustness certification technique, to the MAS consensus context, enabling probabilistic guarantees on agent decisions under adversarial influence. Unlike traditional verification methods, our approach operates in black-box settings and employs a two-stage adaptive sampling mechanism to balance robustness and computational efficiency. Simulation results demonstrate that our method effectively prevents the propagation of adversarial behaviors and hallucinations while maintaining consensus performance. This work provides a practical and scalable path toward safe deployment of LLM-based MAS in real-world, high-stakes environments.
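To make the abstract's two main ideas concrete, here is a minimal Python sketch of randomized smoothing around a black-box decision function with a two-stage sampling budget. It is an illustration under stated assumptions, not the authors' implementation: the names `smoothed_decision`, `hoeffding_lower_bound`, `n_small`, and `n_large` are hypothetical, the input is assumed to be a numeric state vector perturbed with Gaussian noise, and the confidence bound (Hoeffding) and radius formula (the standard Gaussian-smoothing radius, sigma times the inverse normal CDF of the lower bound) are common textbook choices that the paper may not use. The statistical bookkeeping is simplified.

```python
import math
import random
from collections import Counter
from statistics import NormalDist


def hoeffding_lower_bound(successes, n, alpha):
    """One-sided (1 - alpha) lower confidence bound on a Bernoulli mean."""
    return successes / n - math.sqrt(math.log(1.0 / alpha) / (2.0 * n))


def smoothed_decision(black_box, x, sigma, n_small=50, n_large=500,
                      alpha=0.01, rng=None):
    """Majority-vote a black-box decision over Gaussian-perturbed inputs.

    Hypothetical two-stage scheme: try a small sample budget first and
    only spend the larger budget when the first stage is inconclusive.
    Returns (decision, radius, queries_used); decision is None on abstain.
    """
    rng = rng or random.Random()
    counts = Counter()
    n = 0
    for stage_budget in (n_small, n_large):
        for _ in range(stage_budget):
            noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
            counts[black_box(noisy)] += 1  # only queries, no model internals
        n += stage_budget
        top, top_count = counts.most_common(1)[0]
        # Split alpha across the two stages (union bound).
        p_lower = hoeffding_lower_bound(top_count, n, alpha / 2)
        if p_lower > 0.5:
            # Assumed radius formula for Gaussian smoothing (L2 norm).
            radius = sigma * NormalDist().inv_cdf(p_lower)
            return top, radius, n
    return None, 0.0, n  # abstain: no decision is confidently dominant
```

For example, an agent's decision rule could be wrapped as `smoothed_decision(lambda v: "go" if sum(v) > 0 else "hold", [5.0, 5.0], sigma=0.5)`. When the vote is clear, the first stage is enough and the larger budget is never spent; when the vote is close, the function uses both stages and abstains if the majority is still not statistically separated from one half.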
