Raccoon: Prompt Extraction Benchmark of LLM-Integrated Applications

Junlin Wang; Tianyi Yang; Roy Xie; Bhuwan Dhingra

arXiv:2406.06737 · cs.CR · October 29, 2024

TL;DR

The paper introduces Raccoon, a comprehensive benchmark for evaluating the vulnerability of LLMs to prompt extraction attacks, including diverse attack types and defenses, to improve robustness assessment.

Contribution

It presents what the authors describe as the most extensive benchmark to date for evaluating LLM susceptibility to prompt theft, assessing models in two scenarios (defenseless and defended) against a wide range of attack and defense strategies.

Findings

01

Models are generally vulnerable without defenses.

02

OpenAI models show resilience with proper defenses.

03

The benchmark covers 14 categories of prompt extraction attacks, additional compounded attacks, and a diverse collection of defense templates.

Abstract

With the proliferation of LLM-integrated applications such as GPT-s, millions are deployed, offering valuable services through proprietary instruction prompts. These systems, however, are prone to prompt extraction attacks through meticulously designed queries. To help mitigate this problem, we introduce the Raccoon benchmark which comprehensively evaluates a model's susceptibility to prompt extraction attacks. Our novel evaluation method assesses models under both defenseless and defended scenarios, employing a dual approach to evaluate the effectiveness of existing defenses and the resilience of the models. The benchmark encompasses 14 categories of prompt extraction attacks, with additional compounded attacks that closely mimic the strategies of potential attackers, alongside a diverse collection of defense templates. This array is, to our knowledge, the most extensive compilation of…
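The dual evaluation described in the abstract can be illustrated with a minimal sketch: score each model response by how much of the system prompt it reproduces, then compare the attack success rate with and without a defense. The overlap metric, the 0.9 threshold, and the function names below are assumptions made for illustration only; they are not taken from the paper or from the raccoonbench repository.

```python
from difflib import SequenceMatcher


def leaked_fraction(system_prompt: str, response: str) -> float:
    """Fraction of the system prompt's characters that also appear,
    in order, in the response (hypothetical leak metric)."""
    if not system_prompt:
        return 0.0
    matcher = SequenceMatcher(None, system_prompt, response, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(system_prompt)


def attack_success_rate(system_prompt: str, responses: list[str],
                        threshold: float = 0.9) -> float:
    """Share of responses that leak at least `threshold` of the prompt.
    The threshold value is an assumption, not a published setting."""
    if not responses:
        return 0.0
    hits = sum(leaked_fraction(system_prompt, r) >= threshold for r in responses)
    return hits / len(responses)
```

Running `attack_success_rate` once on responses collected without a defense and once on responses collected with a defense template in place gives the two numbers that a dual-scenario comparison would contrast.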

Code & Models

Repositories

m0gician/raccoonbench (official)

Taxonomy

Topics: Fault Detection and Control Systems · Advanced Data Processing Techniques · Neural Networks and Applications