AgentRaft: Automated Detection of Data Over-Exposure in LLM Agents
Yixi Lin (1), Jiangrong Wu (1), Yuhong Nan (1), Xueqiang Wang (2), Xinyuan Zhang (1), Zibin Zheng (1) ((1) Sun Yat-sen University, Zhuhai, Guangdong, China, (2) University of Central Florida, Orlando, Florida, USA)

TL;DR
This paper introduces AgentRaft, an automated framework that detects data over-exposure risks in LLM agents by combining program analysis, semantic reasoning, and privacy regulation compliance, revealing systemic privacy risks.
Contribution
AgentRaft is the first automated tool to detect data over-exposure in LLM agents, integrating program analysis, semantic reasoning, and privacy regulation compliance.
Findings
DOE is prevalent in 57.07% of tool interaction paths
AgentRaft outperforms baselines by 87.24% in detection accuracy
Reaches 99% DOE coverage with only 150 prompts
Abstract
The rapid integration of Large Language Model (LLM) agents into autonomous task execution has introduced significant privacy concerns within cross-tool data flows. In this paper, we systematically investigate and define a novel risk termed Data Over-Exposure (DOE) in LLM agents, where an agent inadvertently transmits sensitive data beyond the scope of user intent and functional necessity. We identify that DOE is primarily driven by the broad data paradigms in tool design and the coarse-grained data processing inherent in LLMs. We present AgentRaft, the first automated framework for detecting DOE risks in LLM agents. AgentRaft combines program analysis with semantic reasoning through three synergistic modules: (1) it constructs a Cross-Tool Function Call Graph (FCG) to model the interaction landscape of heterogeneous tools; (2) it traverses the FCG to synthesize…
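To make the first two modules more concrete, here is a minimal sketch of a cross-tool call graph and a bounded traversal over it. It is not AgentRaft's implementation: the abstract above is cut off before it says what the traversal synthesizes, so the sketch stops at path enumeration. The `Tool` fields, the type-overlap rule for edges, the `is_sink` flag, and the three example tools are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical tool description; not taken from the paper."""
    name: str
    inputs: frozenset       # data types the tool accepts
    outputs: frozenset      # data types the tool returns
    is_sink: bool = False   # True if the tool sends data outside the agent


def build_fcg(tools):
    """Assumed edge rule: a -> b when some output type of a is an input type of b."""
    return {
        a.name: [b.name for b in tools if b is not a and a.outputs & b.inputs]
        for a in tools
    }


def paths_to_sinks(graph, tools, start, max_len=4):
    """Enumerate simple paths (no repeated tool) from start that end at a sink."""
    sinks = {t.name for t in tools if t.is_sink}
    found = []

    def walk(path):
        if len(path) > 1 and path[-1] in sinks:
            found.append(list(path))
        if len(path) >= max_len:
            return
        for nxt in graph[path[-1]]:
            if nxt not in path:
                walk(path + [nxt])

    walk([start])
    return found


# Hypothetical tool set: a data source, a transformer, and an outbound sink.
tools = [
    Tool("read_contacts", frozenset(), frozenset({"contact"})),
    Tool("summarize", frozenset({"contact", "text"}), frozenset({"text"})),
    Tool("send_email", frozenset({"text", "contact"}), frozenset(), is_sink=True),
]
graph = build_fcg(tools)
paths = paths_to_sinks(graph, tools, "read_contacts")
for p in paths:
    print(" -> ".join(p))
```

Each enumerated path is a candidate data flow from a source tool to an outbound tool. Under the definition above, deciding whether a given flow carries more than the user intended would still need the semantic reasoning step, which this sketch does not attempt.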
Taxonomy
Topics: Advanced Malware Detection Techniques · Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI
