CI-Bench: Benchmarking Contextual Integrity of AI Assistants on Synthetic Data

Zhao Cheng; Diane Wan; Matthew Abueg; Sahra Ghalebikesabi; Ren Yi; Eugene Bagdasarian; Borja Balle; Stefan Mellem; Shawn O'Banion

arXiv:2409.13903 · cs.AI · September 24, 2024

TL;DR

CI-Bench is a synthetic benchmark designed to evaluate AI assistants' ability to protect personal information during inference, using a scalable data pipeline and the Contextual Integrity framework to assess information flow across context dimensions.

Contribution

The paper introduces a novel, scalable synthetic data pipeline and a comprehensive benchmark for evaluating privacy-preserving capabilities of AI assistants based on Contextual Integrity.

Findings

01. Demonstrated the benchmark's ability to evaluate information flow
02. Showcased the need for careful training of AI assistants for privacy
03. Provided a baseline naive AI assistant for comparison

Abstract

Advances in generative AI point towards a new era of personalized applications that perform diverse tasks on behalf of users. While general AI assistants have yet to fully emerge, their potential to share personal data raises significant privacy challenges. This paper introduces CI-Bench, a comprehensive synthetic benchmark for evaluating the ability of AI assistants to protect personal information during model inference. Leveraging the Contextual Integrity framework, our benchmark enables systematic assessment of information flow across important context dimensions, including roles, information types, and transmission principles. We present a novel, scalable, multi-step synthetic data pipeline for generating natural communications, including dialogues and emails. Unlike previous work with smaller, narrowly focused evaluations, we present a novel, scalable, multi-step data pipeline that…
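
The context dimensions named in the abstract (roles, information types, and transmission principles) can be pictured as fields of a single information-flow record. The sketch below is only an illustration of that idea, not the benchmark's actual schema or scoring method: the class name `InformationFlow`, its field names, the `ALLOWED_FLOWS` set, and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InformationFlow:
    """One information flow described by Contextual Integrity dimensions.

    Field names are hypothetical; they are not taken from CI-Bench.
    """
    sender_role: str
    recipient_role: str
    information_type: str
    transmission_principle: str


# Hypothetical contextual norms: flows treated as appropriate in this toy example.
ALLOWED_FLOWS = {
    InformationFlow("patient", "doctor", "medical history", "confidentiality"),
}


def is_appropriate(flow: InformationFlow, norms=ALLOWED_FLOWS) -> bool:
    """Return True if the flow exactly matches one of the listed norms."""
    return flow in norms


ok_flow = InformationFlow("patient", "doctor", "medical history", "confidentiality")
bad_flow = InformationFlow("patient", "employer", "medical history", "confidentiality")

print(is_appropriate(ok_flow))   # matches the listed norm
print(is_appropriate(bad_flow))  # recipient role differs, so no norm matches
```

Exact set membership is a deliberately simple stand-in here. The same piece of information is judged appropriate or not depending on the other dimensions of the flow, which is the property the benchmark is described as assessing.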

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI · AI in Service Interactions

Methods: ALIGN