Agent Tools Orchestration Leaks More: Dataset, Benchmark, and Mitigation

Yuxuan Qiao; Dongqin Liu; Hongchang Yang; Wei Zhou; Songlin Hu

arXiv:2512.16310 · cs.CR · March 9, 2026

Open Access

TL;DR

This paper systematically studies the privacy risks in multi-tool autonomous agents driven by large language models, revealing pervasive information leakage and proposing mitigation strategies to enhance safety.

Contribution

The paper introduces the first formal framework and benchmark for Tools Orchestration Privacy Risk (TOP-R), analyzes the causes of leakage, and proposes effective mitigation methods.

Findings

1. Average leakage rate of 62.11% across models

2. Mitigation strategies improve H-Score to 79.20%

3. Identifies key causes: privacy awareness, reasoning overshoot, inference inertia

Abstract

Driven by Large Language Models, the single-agent, multi-tool architecture has become a popular paradigm for autonomous agents. However, this architecture introduces a severe privacy risk, which we term Tools Orchestration Privacy Risk (TOP-R): an agent, to achieve a benign user goal, autonomously aggregates non-sensitive fragments from multiple tools and synthesizes unexpected sensitive information. We provide the first systematic study of this risk. We establish a formal framework characterizing TOP-R through three necessary conditions -- conclusion sensitivity, single-source non-inferability, and compositional inferability. We construct TOP-Bench via a Reverse Inference Seed Expansion (RISE) pipeline, incorporating paired social-context scenarios for diagnostic analysis. We further introduce the H-Score, a harmonic mean of task completion and safety, to quantify the utility-safety…
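The abstract describes the H-Score only as a harmonic mean of task completion and safety. A minimal sketch of that calculation follows, assuming both quantities are rates in [0, 1]. The function name `h_score`, the argument names, and the zero-handling convention are illustrative assumptions rather than the paper's definitions, and how the paper measures each rate is not stated in the text above.

```python
def h_score(task_completion: float, safety: float) -> float:
    """Harmonic mean of two rates, each assumed to lie in [0, 1].

    Hypothetical helper: the paper's exact definitions of task
    completion and safety are not given in the abstract.
    """
    for name, value in (("task_completion", task_completion), ("safety", safety)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")

    total = task_completion + safety
    if total == 0.0:
        # Assumed convention: both rates zero gives a score of zero,
        # which also avoids dividing by zero.
        return 0.0
    return 2.0 * task_completion * safety / total


if __name__ == "__main__":
    # Made-up inputs for illustration only, not results from the paper.
    print(h_score(0.9, 0.5))
```

A harmonic mean is pulled toward the smaller of its two inputs, so an agent that completes most tasks but leaks often, or one that is safe but rarely completes tasks, cannot score well on the strength of a single metric.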

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI