Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models

Zi Liang; Haibo Hu; Qingqing Ye; Yaxin Xiao; Haoyang Li

arXiv:2408.02416·cs.CL·February 13, 2025

Open Access · 1 Repo · 1 Dataset

TL;DR

This paper investigates how prompts are leaked from large language models, analyzes the factors influencing prompt extraction, and proposes defense strategies to mitigate prompt leakage threats in prompt-based LLM services.

Contribution

It introduces a detailed analysis of prompt leakage mechanisms, explores key attributes affecting extraction, and develops effective defenses to protect prompt confidentiality in LLMs.

Findings

01

Prompt extraction is influenced by model size, prompt length, and prompt type.

02

Current LLMs, including GPT-4, are highly vulnerable to prompt extraction attacks.

03

The proposed defenses reduce prompt extraction rates by more than 70%.

Abstract

The drastic increase in the parameter counts of large language models (LLMs) has led to a new research direction: fine-tuning-free downstream customization by prompts, i.e., task descriptions. While these prompt-based services (e.g., OpenAI's GPTs) play an important role in many businesses, concerns are growing about prompt leakage, which undermines the intellectual property of these services and enables downstream attacks. In this paper, we analyze the underlying mechanism of prompt leakage, which we refer to as prompt memorization, and develop corresponding defense strategies. By exploring the scaling laws in prompt extraction, we analyze key attributes that influence prompt extraction, including model size, prompt length, and prompt type. We then propose two hypotheses that explain how LLMs expose their prompts. The first is attributed to the perplexity,…

Code & Models

Repositories

liangzid/promptextractioneval
PyTorch · Official

Datasets

liangzid/PEAD
dataset · 14 dl

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Attention Is All You Need · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Linear Layer · Attention Dropout · Label Smoothing · Residual Connection · Transformer