ToolFlood: Beyond Selection -- Hiding Valid Tools from LLM Agents via Semantic Covering
Hussein Jawad, Nicolas J-B Brunel

TL;DR
ToolFlood is a novel attack that overwhelms LLM tool retrieval by injecting malicious tools whose metadata semantically covers many user queries, significantly disrupting tool selection with minimal injection effort.
Contribution
This paper introduces ToolFlood, a new adversarial attack that exploits embedding-space geometry to dominate tool retrieval in LLM agents, exposing a robustness weakness in the retrieval stage.
Findings
Achieves up to 95% attack success rate on benchmarks
Effective at injection rates as low as 1%
Theoretically analyzes retrieval saturation effects
Abstract
Large Language Model (LLM) agents increasingly use external tools for complex tasks and rely on embedding-based retrieval to select a small top-k subset for reasoning. As these systems scale, the robustness of this retrieval stage is underexplored, even though prior work has examined attacks on tool selection. This paper introduces ToolFlood, a retrieval-layer attack on tool-augmented LLM agents. Rather than altering which tool is chosen after retrieval, ToolFlood overwhelms retrieval itself by injecting a few attacker-controlled tools whose metadata is carefully placed by exploiting the geometry of embedding space. These tools semantically span many user queries, dominate the top-k results, and push all benign tools out of the agent's context. ToolFlood uses a two-phase adversarial tool generation strategy. It first samples subsets of target queries and uses an LLM to iteratively…
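The retrieval-saturation idea described in the abstract can be illustrated with a toy sketch. This is not the paper's two-phase generation procedure (the abstract is truncated before it is fully described); it only shows, under simplifying assumptions, how a few injected tools placed near a query in embedding space can occupy every top-k slot. The 2-D vectors, the tool names (weather_api, evil_1, and so on), and the helper functions cosine and top_k are all hypothetical and chosen purely for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, tools, k):
    """Return the names of the k tools whose embeddings are most similar to the query."""
    ranked = sorted(tools, key=lambda name: cosine(query, tools[name]), reverse=True)
    return ranked[:k]

# Hypothetical 2-D "embeddings" of benign tool metadata (toy values, not real data).
benign = {
    "weather_api": (1.0, 0.1),
    "calendar": (0.1, 1.0),
    "maps": (0.8, 0.6),
}

query = (0.9, 0.3)  # hypothetical embedding of a user query
k = 2               # size of the retrieved subset passed to the agent

# Retrieval over the clean tool set.
before = top_k(query, benign, k)

# Assumed attacker tools whose metadata embeds very close to the query.
injected = {
    "evil_1": (0.9, 0.3),
    "evil_2": (0.9, 0.31),
}

# Retrieval after injection: the attacker tools take every top-k slot.
after = top_k(query, {**benign, **injected}, k)

print("before:", before)
print("after: ", after)
```

In this toy setup the benign tools weather_api and maps are retrieved before injection, and both are displaced once the two attacker tools are added. Covering many distinct queries at once, as the abstract describes, would require placing injected tools near many query regions rather than a single point.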
Taxonomy
Topics: Topic Modeling · Adversarial Robustness in Machine Learning · Artificial Intelligence in Healthcare and Education
