ConfusionPrompt: Practical Private Inference for Online Large Language Models

Peihua Mai, Youjia Yang, Ran Yan, Rui Ye, and Yan Pang

arXiv:2401.00870 · cs.CR · April 9, 2026 · 1 citation

TL;DR

ConfusionPrompt is a privacy-preserving framework for online large language model inference that decomposes prompts and uses pseudo-prompts to protect user data while maintaining high utility.

Contribution

The paper introduces a prompt decomposition and pseudo-prompt generation method that improves the privacy-utility trade-off of LLM inference and works with black-box models.

Findings

01. Achieves higher utility than local inference and perturbation methods.

02. Reduces memory consumption compared to open-source LLMs.

03. Provides a formal privacy model and complexity analysis.

Abstract

State-of-the-art large language models (LLMs) are typically deployed as online services, requiring users to transmit detailed prompts to cloud servers. This raises significant privacy concerns. In response, we introduce ConfusionPrompt, a novel framework for private LLM inference that protects user privacy by: (i) decomposing the original prompt into smaller sub-prompts, and (ii) generating pseudo-prompts alongside the genuine sub-prompts, which are then sent to the LLM. The server responses are later recomposed by the user to reconstruct the final output. This approach offers key advantages over previous LLM privacy protection methods: (i) it integrates seamlessly with existing black-box LLMs, and (ii) it delivers a significantly improved privacy-utility trade-off compared to existing text perturbation methods. We also develop a $(\lambda, \mu, \rho)$-privacy model to formulate the…
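
The client-side flow described in the abstract (send genuine sub-prompts mixed with pseudo-prompts, then recompose only the genuine answers) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function `private_query` and its parameters `query_llm` and `recompose` are hypothetical names, the decomposition and pseudo-prompt generation steps are assumed to have already run locally, and the sub-prompts are assumed to be independent of one another.

```python
import random
from typing import Callable, List, Optional, Tuple


def private_query(
    sub_prompts: List[str],
    pseudo_prompts: List[str],
    query_llm: Callable[[str], str],
    recompose: Callable[[List[str], List[str]], str],
    rng: random.Random,
) -> str:
    """Hypothetical client-side wrapper around a black-box LLM.

    Genuine sub-prompts are tagged with their position; pseudo-prompts
    are tagged with None so their responses can be discarded locally.
    """
    batch: List[Tuple[Optional[int], str]] = [
        (i, p) for i, p in enumerate(sub_prompts)
    ] + [(None, p) for p in pseudo_prompts]

    # Shuffle so the order of requests does not reveal which are genuine.
    rng.shuffle(batch)

    answers: List[str] = [""] * len(sub_prompts)
    for index, prompt in batch:
        reply = query_llm(prompt)  # the server sees every prompt
        if index is not None:      # only genuine replies are kept
            answers[index] = reply

    # Recomposition happens on the user's side (assumed: a local function).
    return recompose(sub_prompts, answers)


if __name__ == "__main__":
    # Stand-in for an online LLM: it simply echoes the prompt.
    def echo_llm(prompt: str) -> str:
        return f"answer({prompt})"

    result = private_query(
        ["sub-question 1", "sub-question 2"],
        ["decoy question A", "decoy question B"],
        echo_llm,
        lambda subs, ans: " | ".join(ans),
        random.Random(0),
    )
    print(result)
```

The tagging is kept entirely on the client, so the server receives a flat stream of prompts with no marker separating genuine from pseudo requests. How the pseudo-prompts are generated, and how many are needed for a given privacy level, is governed by the paper's privacy model and is outside the scope of this sketch.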
