Dye4AI: Assuring Data Boundary on Generative AI Services

Shu Wang; Kun Sun; Yan Zhai

arXiv:2406.14114·cs.CR·June 21, 2024

TL;DR

Dye4AI is a dye testing system that checks data boundaries on generative AI services: it injects crafted triggers into human-AI dialogue and later tries to retrieve them with specific prompts, revealing whether user data flowed into the model's evolution.

Contribution

The paper introduces a novel dye testing framework with a unique trigger design and conversation strategy to detect data leakage in various large language models.

Findings

1. Effective in detecting data leakage across six LLMs
2. Larger models such as OpenLLaMA-13B are more suitable for dye testing
3. Prompt selection affects trigger recovery success

Abstract

Generative artificial intelligence (AI) is versatile for various applications, but security and privacy concerns with third-party AI vendors hinder its broader adoption in sensitive scenarios. Hence, it is essential for users to validate the AI trustworthiness and ensure the security of data boundaries. In this paper, we present a dye testing system named Dye4AI, which injects crafted trigger data into human-AI dialogue and observes AI responses towards specific prompts to diagnose data flow in AI model evolution. Our dye testing procedure contains three stages: trigger generation, trigger insertion, and trigger retrieval. First, to retain both uniqueness and stealthiness, we design a new trigger that transforms a pseudo-random number into an intelligible format. Second, with a custom-designed three-step conversation strategy, we insert each trigger item into dialogue and confirm the model…
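
The trigger generation and retrieval stages can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the trigger encoding, so the word list `WORDS`, the HMAC-based derivation, and the helper names `make_trigger` and `was_retrieved` are hypothetical and should not be read as the authors' design. The sketch only shows the general idea of turning a pseudo-random number into readable text that can later be searched for in a model response.

```python
import hashlib
import hmac

# Hypothetical 16-word vocabulary; the paper's actual encoding is not
# described in the abstract.
WORDS = [
    "amber", "birch", "cedar", "delta", "ember", "flint", "grove", "harbor",
    "iris", "jade", "kelp", "lotus", "maple", "nectar", "onyx", "pearl",
]


def make_trigger(secret: bytes, index: int, n_words: int = 4) -> str:
    """Derive a pseudo-random number from (secret, index) and render it as words.

    Assumption: HMAC-SHA256 stands in for the pseudo-random source.
    """
    digest = hmac.new(secret, str(index).encode(), hashlib.sha256).digest()
    # The low 4 bits of each byte select one of the 16 words.
    return " ".join(WORDS[b & 0x0F] for b in digest[:n_words])


def was_retrieved(trigger: str, response: str) -> bool:
    """Report whether the trigger text appears in a model response."""
    return trigger.lower() in response.lower()


if __name__ == "__main__":
    trigger = make_trigger(b"tester-secret", index=0)
    print(trigger)
    print(was_retrieved(trigger, f"I recall the phrase: {trigger}."))
```

Because the trigger is derived deterministically from a secret held by the tester, the same phrase can be regenerated later for comparison without storing it alongside the dialogue.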