RTBAS: Defending LLM Agents Against Prompt Injection and Privacy Leakage
Peter Yong Zhong, Siyuan Chen, Ruiqi Wang, McKenna McCall, Ben L. Titzer, Heather Miller, Phillip B. Gibbons

TL;DR
RTBAS is a novel system that enhances the security of Tool-Based Agent Systems by automatically detecting prompt injection and privacy leaks, significantly reducing attack success while maintaining task performance.
Contribution
The paper introduces RTBAS, a defense mechanism that uses information flow control and two novel dependency screeners to protect LLM agents from prompt injection and privacy leakage.
Findings
Prevents all targeted prompt injection attacks in experiments.
Achieves only 2% utility loss under attack conditions.
Detects privacy leaks with near-oracle accuracy.
Abstract
Tool-Based Agent Systems (TBAS) allow Language Models (LMs) to use external tools for tasks beyond their standalone capabilities, such as searching websites, booking flights, or making financial transactions. However, these tools greatly increase the risks of prompt injection attacks, where malicious content hijacks the LM agent to leak confidential data or trigger harmful actions. Existing defenses (OpenAI GPTs) require user confirmation before every tool call, placing onerous burdens on users. We introduce Robust TBAS (RTBAS), which automatically detects and executes tool calls that preserve integrity and confidentiality, requiring user confirmation only when these safeguards cannot be ensured. RTBAS adapts Information Flow Control to the unique challenges presented by TBAS. We present two novel dependency screeners, using LM-as-a-judge and attention-based saliency, to overcome these…
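The policy described in the abstract (run a tool call automatically when integrity and confidentiality are preserved, otherwise ask the user) can be illustrated with a minimal taint-tracking sketch. This is not the RTBAS implementation: `Label`, `ToolPolicy`, and `decide` are hypothetical names, and the sketch assumes the set of messages a tool call depends on is already known, which is the part the paper's dependency screeners are meant to determine.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Label:
    """Hypothetical security label attached to a message or tool result."""
    untrusted: bool = False  # integrity: derived from untrusted content
    secret: bool = False     # confidentiality: derived from private data

    def join(self, other: "Label") -> "Label":
        # A value derived from two sources is as tainted as the worse of them.
        return Label(self.untrusted or other.untrusted,
                     self.secret or other.secret)


def join_all(labels: Iterable[Label]) -> Label:
    result = Label()
    for label in labels:
        result = result.join(label)
    return result


@dataclass(frozen=True)
class ToolPolicy:
    """Hypothetical per-tool requirements (assumed, not from the paper)."""
    requires_trusted_input: bool  # e.g. a tool with side effects
    output_is_public: bool        # e.g. a tool that sends data to outsiders


def decide(dependencies: Iterable[Label], policy: ToolPolicy) -> str:
    """Return "execute" if the call is safe to run, else "confirm"."""
    label = join_all(dependencies)
    if policy.requires_trusted_input and label.untrusted:
        return "confirm"  # possible prompt injection
    if policy.output_is_public and label.secret:
        return "confirm"  # possible privacy leak
    return "execute"


user_msg = Label()
web_page = Label(untrusted=True)
balance = Label(secret=True)
transfer = ToolPolicy(requires_trusted_input=True, output_is_public=False)
post = ToolPolicy(requires_trusted_input=False, output_is_public=True)

print(decide([user_msg], transfer))
print(decide([user_msg, web_page], transfer))
print(decide([user_msg, balance], post))
```

In this sketch a call that depends only on the user's message runs without interruption, while a call influenced by a fetched web page or one that would publish private data falls back to user confirmation.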
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Blockchain Technology Applications and Security
