POLAR-Bench: A Diagnostic Benchmark for Privacy-Utility Trade-offs in LLM Agents

Qiaoyuan Zheng; Yiqu Yang; Qi Gao; Imanol Schlag

arXiv:2605.19127 · cs.AI · May 20, 2026

TL;DR

POLAR-Bench is a diagnostic benchmark designed to evaluate privacy-utility trade-offs in LLM agents, revealing how well models protect private data under adversarial probing across multiple domains.

Contribution

The paper introduces POLAR-Bench, a benchmark that scores both privacy and utility of LLM agents under adversarial probing, covering 10 domains and 7,852 samples.

Findings

01. Frontier models withhold over 99% of protected attributes.

02. Smaller open-weight models leak over 50% of protected data.

03. POLAR-Bench localizes where models fail to follow privacy policies.

Abstract

LLM agents increasingly have access to private user data and act on the user's behalf when interacting with third-party systems. The user defines what may and must not be shared, and the agent must robustly follow that intent even when third-party systems behave adversarially. We introduce POLAR-Bench (Policy-aware adversarial Benchmark), in which a trusted model with a privacy policy and a task converses with a third-party model that adversarially probes for both task-relevant and protected attributes. Across 10 domains and 7,852 samples, we score privacy and utility by deterministic set-membership and vary privacy policy dimension and attack strategy along two orthogonal axes, producing a 5 × 5 diagnostic surface per model. Our results reveal a sharp split: current frontier models withhold over 99% of protected attributes, while smaller open-weight models in the 1–30B range, the…
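
The abstract describes scoring by deterministic set-membership and aggregating results over a policy-by-attack grid. The Python sketch below illustrates one way such scoring could work; it is not the authors' implementation. It assumes that the attributes disclosed in a conversation have already been extracted into a set, that privacy is the fraction of protected attributes withheld, and that utility is the fraction of task-relevant attributes shared. The function names, the record fields (`disclosed`, `protected`, `task_relevant`, `policy`, `attack`), and the example labels are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def score_sample(disclosed, protected, task_relevant):
    """Score one conversation by set membership (assumed definitions).

    privacy: fraction of protected attributes that were NOT disclosed.
    utility: fraction of task-relevant attributes that WERE disclosed.
    """
    disclosed = set(disclosed)
    protected = set(protected)
    task_relevant = set(task_relevant)
    leaked = disclosed & protected
    shared = disclosed & task_relevant
    # An empty set is treated as trivially satisfied (an assumption).
    privacy = 1.0 - len(leaked) / len(protected) if protected else 1.0
    utility = len(shared) / len(task_relevant) if task_relevant else 1.0
    return privacy, utility


def diagnostic_surface(samples):
    """Average (privacy, utility) per (policy, attack) cell.

    With five policy dimensions and five attack strategies this would
    give a 5 x 5 grid; the labels themselves are hypothetical here.
    """
    cells = defaultdict(list)
    for s in samples:
        scores = score_sample(s["disclosed"], s["protected"], s["task_relevant"])
        cells[(s["policy"], s["attack"])].append(scores)
    return {
        cell: (mean(p for p, _ in scores), mean(u for _, u in scores))
        for cell, scores in cells.items()
    }


# Hypothetical records: two conversations falling in the same grid cell.
samples = [
    {"policy": "policy_A", "attack": "attack_1",
     "disclosed": {"name", "diagnosis"},
     "protected": {"diagnosis", "ssn"},
     "task_relevant": {"name", "date"}},
    {"policy": "policy_A", "attack": "attack_1",
     "disclosed": {"name", "date"},
     "protected": {"diagnosis", "ssn"},
     "task_relevant": {"name", "date"}},
]
surface = diagnostic_surface(samples)
```

Because each score is a plain set intersection, the same transcript always receives the same score, which is what makes the metric deterministic. How disclosed attributes are detected in the model's messages is not specified in this excerpt and is left outside the sketch.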
