AgentSCOPE: Evaluating Contextual Privacy Across Agentic Workflows
Ivoline C. Ngong, Keerthiram Murugesan, Swanand Kadhe, Justin D. Weisz, Amit Dhurandhar, Karthikeyan Natesan Ramamurthy

TL;DR
AgentSCOPE introduces a framework and benchmark for evaluating privacy risks at every stage of agentic workflows, revealing frequent violations even when outputs seem safe.
Contribution
The paper presents the Privacy Flow Graph framework and the AgentSCOPE benchmark to assess privacy violations at every stage of an agentic pipeline, including the intermediate information flows that input- and output-level evaluation does not cover.
Findings
Privacy violations occur in over 80% of scenarios.
Most violations happen at the tool-response stage.
Output-level evaluation underestimates privacy risks.
Abstract
Agentic systems are increasingly acting on users' behalf, accessing calendars, email, and personal files to complete everyday tasks. Privacy evaluation for these systems has focused on the input and output boundaries, but each task involves several intermediate information flows, from agent queries to tool responses, that are not currently evaluated. We argue that every boundary in an agentic pipeline is a site of potential privacy violation and must be assessed independently. To support this, we introduce the Privacy Flow Graph, a Contextual Integrity-grounded framework that decomposes agentic execution into a sequence of information flows, each annotated with the five CI parameters, and traces violations to their point of origin. We present AgentSCOPE, a benchmark of 62 multi-tool scenarios across eight regulatory domains with ground truth at every pipeline stage. Our evaluation…
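As a rough illustration of the idea in the abstract, the sketch below models a pipeline as an ordered list of information flows, each carrying the five Contextual Integrity parameters (sender, recipient, subject, information type, transmission principle), and finds the earliest flow marked inappropriate. It is not the paper's implementation: the `Flow` class, the stage labels, the `appropriate` ground-truth flag, and the example trace are all hypothetical, chosen only to show how a violation at an intermediate stage can be invisible to a check that looks at the final output alone.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Flow:
    """One information flow, annotated with the five CI parameters (hypothetical schema)."""
    stage: str                   # hypothetical stage label, e.g. "tool_response"
    sender: str
    recipient: str
    subject: str
    info_type: str
    transmission_principle: str
    appropriate: bool            # hypothetical ground-truth label for this flow


def violation_origin(flows: List[Flow]) -> Optional[Flow]:
    """Return the earliest inappropriate flow, or None if every flow is appropriate."""
    for flow in flows:
        if not flow.appropriate:
            return flow
    return None


def output_only_violation(flows: List[Flow]) -> bool:
    """Output-level check: inspects only the last flow in the trace."""
    return bool(flows) and not flows[-1].appropriate


# Invented example trace: the tool returns more than the task needs,
# but the final message to the colleague contains nothing sensitive.
trace = [
    Flow("user_request", "user", "agent", "user",
         "scheduling request", "user instruction", True),
    Flow("agent_query", "agent", "calendar tool", "user",
         "date range", "task-scoped query", True),
    Flow("tool_response", "calendar tool", "agent", "user",
         "medical appointment details", "over-broad return", False),
    Flow("final_output", "agent", "colleague", "user",
         "free time slots", "task-scoped reply", True),
]

origin = violation_origin(trace)
print(origin.stage if origin else "no violation")  # tool_response
print(output_only_violation(trace))                # False
```

In this invented trace the per-flow check flags the tool response while the output-only check reports nothing, which mirrors the distinction the abstract draws between boundary-level and stage-level evaluation.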
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Scientific Computing and Data Management · Security and Verification in Computing
