The Dark Side of LLMs: Agent-based Attack Vectors for System-level Compromise
Matteo Lupinacci, Francesco Aurelio Pironti, Francesco Blefari, Francesco Romeo, Luigi Arena, Angelo Furfaro

TL;DR
This paper investigates security vulnerabilities in LLM-based autonomous agents, demonstrating how adversaries can exploit these models to achieve system-level compromise through Direct Prompt Injection, RAG Backdoor Attacks, and Inter-Agent Trust Exploitation.
Contribution
It provides a comprehensive evaluation of attack surfaces in LLM agents, revealing widespread vulnerabilities and offering new insights into the security risks of multi-agent systems.
Findings
94.4% of models are vulnerable to Direct Prompt Injection
83.3% are susceptible to RAG Backdoor Attacks
100% can be compromised via Inter-Agent Trust Exploitation
Abstract
The rapid adoption of Large Language Model (LLM) agents and multi-agent systems enables remarkable capabilities in natural language processing and generation. However, these systems introduce security vulnerabilities that extend beyond traditional content generation to system-level compromise. This paper presents a comprehensive security evaluation of LLMs used as reasoning engines within autonomous agents, highlighting how they can be exploited as attack vectors capable of achieving computer takeover. We focus on how different attack surfaces and trust boundaries can be leveraged to orchestrate such takeovers. We demonstrate that adversaries can effectively coerce popular LLMs into autonomously installing and executing malware on victim machines. Our evaluation of 18 state-of-the-art LLMs reveals that 94.4% of models succumb to Direct Prompt Injection, and 83.3% are vulnerable to…
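The reported percentages can be read back as model counts. The sketch below is a minimal arithmetic check, not something taken from the paper: it assumes all three rates are computed over the same 18 evaluated models, and the names `TOTAL_MODELS`, `reported_rates`, and `implied_count` are hypothetical helpers introduced here for illustration.

```python
# Assumption: every reported rate uses the same denominator of 18 models.
TOTAL_MODELS = 18

# Rates (in percent) as quoted in the Findings list.
reported_rates = {
    "Direct Prompt Injection": 94.4,
    "RAG Backdoor Attack": 83.3,
    "Inter-Agent Trust Exploitation": 100.0,
}


def implied_count(rate_percent, total=TOTAL_MODELS):
    """Return the whole number of models closest to the given percentage."""
    return round(rate_percent / 100 * total)


# Map each attack to the number of affected models implied by its rate.
implied_counts = {
    name: implied_count(rate) for name, rate in reported_rates.items()
}

for name, count in implied_counts.items():
    # Recompute the percentage from the count to check it matches the quote.
    print(f"{name}: {count}/{TOTAL_MODELS} models ({count / TOTAL_MODELS:.1%})")
```

Under that assumption the rates correspond to 17, 15, and 18 of the 18 models, and the recomputed percentages round back to the quoted figures.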
