SPILLage: Agentic Oversharing on the Web

Jaechul Roh; Eugene Bagdasarian; Hamed Haddadi; Ali Shahin Shamsabadi

arXiv:2602.13516 · cs.AI · February 17, 2026

TL;DR

This paper introduces SPILLage, a framework for analyzing and benchmarking the unintentional disclosure of user information by web agents through content and behavioral oversharing, highlighting its prevalence and impact on task success.

Contribution

The paper formalizes the concept of agentic oversharing, introduces the SPILLage framework, and provides empirical benchmarks showing behavioral oversharing dominates content oversharing across various tasks and models.

Findings

01. Behavioral oversharing exceeds content oversharing by 5x.

02. Oversharing persists and worsens under prompt mitigation.

03. Removing task-irrelevant info improves task success by up to 17.9%.

Abstract

LLM-powered agents are beginning to automate users' tasks across the open web, often with access to user resources such as emails and calendars. Unlike standard LLMs answering questions in a controlled chatbot setting, web agents act "in the wild", interacting with third parties and leaving behind an action trace. Therefore, we ask the question: how do web agents handle user resources when accomplishing tasks on their behalf across live websites? In this paper, we formalize Natural Agentic Oversharing -- the unintentional disclosure of task-irrelevant user information through an agent trace of actions on the web. We introduce SPILLage, a framework that characterizes oversharing along two dimensions: channel (content vs. behavior) and directness (explicit vs. implicit). This taxonomy reveals a critical blind spot: while prior work focuses on text leakage, web agents also overshare…
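The two-dimensional taxonomy in the abstract (channel crossed with directness) can be pictured as a small data structure. The sketch below is illustrative only and is not the authors' implementation: the class names, the `tally` helper, and the example actions are all hypothetical. Only the price-range filter example is taken from this page, where Reviewer 02 notes that the paper treats it as implicit behavioral leakage.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Channel(Enum):
    CONTENT = "content"    # what the agent writes (e.g. typed text)
    BEHAVIOR = "behavior"  # what the agent does (e.g. clicks, filters)


class Directness(Enum):
    EXPLICIT = "explicit"  # the information is stated outright
    IMPLICIT = "implicit"  # the information can be inferred


@dataclass(frozen=True)
class OversharingEvent:
    """One labeled step of an agent trace (hypothetical schema)."""
    action: str
    channel: Channel
    directness: Directness


def tally(events):
    """Count events per (channel, directness) cell of the taxonomy."""
    return Counter((e.channel, e.directness) for e in events)


# Hypothetical trace; the labels are assumptions made for illustration.
events = [
    OversharingEvent("typed an unrelated personal detail into a search box",
                     Channel.CONTENT, Directness.EXPLICIT),
    OversharingEvent("selected a price-range filter",
                     Channel.BEHAVIOR, Directness.IMPLICIT),
    OversharingEvent("repeated a product search with narrower terms",
                     Channel.BEHAVIOR, Directness.IMPLICIT),
]

counts = tally(events)
for (channel, directness), n in sorted(counts.items(),
                                       key=lambda kv: (kv[0][0].value, kv[0][1].value)):
    print(f"{channel.value}/{directness.value}: {n}")
```

Keeping the two dimensions as separate enums means the four cells of the taxonomy fall out of their product, so a per-cell count such as behavioral versus content oversharing is a single pass over the labeled trace.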

Peer Reviews

Decision · Submitted to ICLR 2026

Reviewer 01 · Rating 2 · Confidence 5

Strengths

Studying whether LLM agents overshare private information conveyed by the user is very relevant. I particularly like the behavioral approach, where the authors don’t set up the problem in an adversarial setting, but in a naturally occurring way. These cases are pervasive, and we’re not able to control for them (e.g., another model can detect the issue before the agent acts). The writing is simple, clear, and easy to follow.

Weaknesses

As I said in the strengths, I really like the idea and the approach, but what you have so far can only be considered a pilot. I encourage the authors to improve this work, as I’d like to see it published at a top venue.

- The paper is very LLM-heavy, meaning that you use them for absolutely everything. You generate a synthetic dataset using one (Claude 3.7 Sonnet), run experiments with three (o3, o4-mini, GPT-4o), and evaluate the results with an LLM-as-judge (GPT-4.1 Mini). I think this shows

Reviewer 02 · Rating 4 · Confidence 4

Strengths

- The study on contextual privacy leakage in the context of web agents is important and much needed. The coverage of explicit and implicit leakage and leakage through text disclosure or actions is comprehensive.
- The evaluation on real-world websites provides further insights into how such privacy leakage can occur with current web agents in a realistic environment.
- The ablation study showed removing inappropriate information preserves privacy and improves utility at the same time, suggesting

Weaknesses

- The definition of "inappropriate information" feels vague and anecdotal. I didn't find a systematic process for judging what is appropriate or inappropriate to share and reasonable justifications for the process. In many examples presented in the paper, I feel the disclosed information is borderline sensitive, and some even seems necessary for the task (e.g., selecting a price range is a common action to do when the user is on a budget, while this is considered an implicit behavioral leakage a

Reviewer 03 · Rating 2 · Confidence 4

Strengths

- Testing agents' oversharing in a real-world web environment is certainly important. This real-world benchmark allows doing that (although I do have some concerns about this, listed in the next section).
- The taxonomy of oversharing categories is quite interesting, and it provides a nuanced understanding of the issue.

Weaknesses

- I don't think the use of live websites is a good idea here. It makes it challenging to control variables and reproduce results, which is a crucial aspect of scientific experimentation. Environments such as BrowserGym and WebArena, by contrast, allow this.
- Looking at the examples, the definition and detection of implicit (both content and behavior) oversharing may be subjective, and it's unclear whether humans would perform better in similar situations. For example, for a repeated search of products

Taxonomy

Topics: Spam and Phishing Detection · Web Data Mining and Analysis · Privacy, Security, and Data Protection