PersonaBench: Evaluating AI Models on Understanding Personal Information through Accessing (Synthetic) Private User Data

Juntao Tan; Liangwei Yang; Zuxin Liu; Zhiwei Liu; Rithesh Murthy; Tulika Manoj Awalgaonkar; Jianguo Zhang; Weiran Yao; Ming Zhu; Shirley Kokane; Silvio Savarese; Huan Wang; Caiming Xiong; Shelby Heinecke

arXiv:2502.20616 · cs.AI · August 22, 2025

TL;DR

PersonaBench introduces a synthetic data benchmark to evaluate AI models' ability to understand and extract personal information from simulated private user data, revealing current limitations in personalization capabilities.

Contribution

We developed a synthetic data pipeline and benchmark, PersonaBench, to assess AI models' performance in understanding personal information from private data, addressing a key evaluation gap.

Findings

01 Current retrieval-augmented models struggle to answer questions about a user's private information.

02 Synthetic data effectively simulates realistic user profiles.

03 The results highlight the need for improved personalization methods in AI.

Abstract

Personalization is critical in AI assistants, particularly in the context of private AI models that work with individual users. A key scenario in this domain involves enabling AI models to access and interpret a user's private data (e.g., conversation history, user-AI interactions, app usage) to understand personal details such as biographical information, preferences, and social connections. However, due to the sensitive nature of such data, there are no publicly available datasets that allow us to assess an AI model's ability to understand users through direct access to personal information. To address this gap, we introduce a synthetic data generation pipeline that creates diverse, realistic user profiles and private documents simulating human activities. Leveraging this synthetic data, we present PersonaBench, a benchmark designed to evaluate AI models' performance in…
