SPILLage: Agentic Oversharing on the Web
Jaechul Roh, Eugene Bagdasarian, Hamed Haddadi, Ali Shahin Shamsabadi

TL;DR
This paper introduces SPILLage, a framework for analyzing and benchmarking the unintentional disclosure of user information by web agents through content and behavioral oversharing, highlighting its prevalence and impact on task success.
Contribution
The paper formalizes the concept of agentic oversharing, introduces the SPILLage framework, and provides empirical benchmarks showing behavioral oversharing dominates content oversharing across various tasks and models.
Findings
Behavioral oversharing exceeds content oversharing by 5x.
Oversharing persists and worsens under prompt mitigation.
Removing task-irrelevant info improves task success by up to 17.9%.
Abstract
LLM-powered agents are beginning to automate users' tasks across the open web, often with access to user resources such as emails and calendars. Unlike standard LLMs answering questions in a controlled ChatBot setting, web agents act "in the wild", interacting with third parties and leaving behind an action trace. Therefore, we ask the question: how do web agents handle user resources when accomplishing tasks on their behalf across live websites? In this paper, we formalize Natural Agentic Oversharing -- the unintentional disclosure of task-irrelevant user information through an agent trace of actions on the web. We introduce SPILLage, a framework that characterizes oversharing along two dimensions: channel (content vs. behavior) and directness (explicit vs. implicit). This taxonomy reveals a critical blind spot: while prior work focuses on text leakage, web agents also overshare…
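The two-dimensional taxonomy described in the abstract (channel × directness) can be pictured as a small data structure. The following is a minimal sketch, not the authors' implementation: the class names, the `OversharingEvent` fields, the `tally` helper, and the example events are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Channel(Enum):
    CONTENT = "content"    # information disclosed through text the agent enters
    BEHAVIOR = "behavior"  # information disclosed through the agent's actions


class Directness(Enum):
    EXPLICIT = "explicit"
    IMPLICIT = "implicit"


@dataclass(frozen=True)
class OversharingEvent:
    """One labeled step in an agent trace (hypothetical schema)."""
    step: int
    channel: Channel
    directness: Directness
    note: str


def tally(events):
    """Count events in each (channel, directness) cell of the taxonomy."""
    return Counter((e.channel, e.directness) for e in events)


# Made-up labels for illustration; these are not results from the paper.
events = [
    OversharingEvent(1, Channel.CONTENT, Directness.EXPLICIT, "typed task-irrelevant detail"),
    OversharingEvent(2, Channel.BEHAVIOR, Directness.IMPLICIT, "applied a price-range filter"),
    OversharingEvent(3, Channel.BEHAVIOR, Directness.EXPLICIT, "clicked a revealing category"),
]

counts = tally(events)
behavior_total = sum(n for (ch, _), n in counts.items() if ch is Channel.BEHAVIOR)
content_total = sum(n for (ch, _), n in counts.items() if ch is Channel.CONTENT)
```

Keeping the two axes as separate enums means every labeled event falls into exactly one of the four cells, so per-channel totals such as `behavior_total` and `content_total` can be compared directly.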
Peer Reviews
Decision · Submitted to ICLR 2026
Studying whether LLM agents overshare private information conveyed by the user is very relevant. I particularly like the behavioral approach, where the authors don’t set up the problem in an adversarial setting, but in a naturally occurring way. These cases are pervasive, and we’re not able to control for them (e.g., another model can detect the issue before the agent acts). The writing is simple, clear, and easy to follow.
As I said in the strengths, I really like the idea and the approach, but what you have so far can only be considered a pilot. I encourage the authors to improve this work, as I’d like to see it published at a top venue.
- The paper is very LLM-heavy, meaning that you use them for absolutely everything. You generate a synthetic dataset using one (Claude 3.7 Sonnet), run experiments with three (o3, o4-mini, GPT-4o), and evaluate the results with an LLM-as-judge (GPT-4.1 Mini). I think this shows
- The study on contextual privacy leakage in the context of web agents is important and much needed. The coverage of explicit and implicit leakage and leakage through text disclosure or actions is comprehensive.
- The evaluation on real-world websites provides further insights into how such privacy leakage can occur with current web agents in a realistic environment.
- The ablation study showed removing inappropriate information preserves privacy and improves utility at the same time, suggesting
- The definition of "inappropriate information" feels vague and anecdotal. I didn't find a systematic process for judging what is appropriate or inappropriate to share and reasonable justifications for the process. In many examples presented in the paper, I feel the disclosed information is borderline sensitive, and some even seems necessary for the task (e.g., selecting a price range is a common action to do when the user is on a budget, while this is considered an implicit behavioral leakage a
- Testing agents' oversharing in a real-world web environment is certainly important. This real-world benchmark allows doing that (although I do have some concerns about this, listed in the next section).
- The taxonomy of oversharing categories is quite interesting and provides a nuanced understanding of the issue.
- I don't think the use of live websites is a good idea here. It makes it challenging to control variables and reproduce results, which is a crucial aspect of scientific experimentation, whereas environments such as BrowserGym and WebArena allow this.
- By looking into examples, the definition and detection of implicit (both content and behavior) oversharing may be subjective, and it's unclear whether humans would perform better in similar situations. For example, for a repeated search of products
Taxonomy
Topics: Spam and Phishing Detection · Web Data Mining and Analysis · Privacy, Security, and Data Protection
