A General Pseudonymization Framework for Cloud-Based LLMs: Replacing Privacy Information in Controlled Text Generation

Shilong Hou; Ruilin Shang; Zi Long; Xianghua Fu; Yin Chen

arXiv:2502.15233·cs.CR·February 24, 2025

TL;DR

This paper introduces a comprehensive pseudonymization framework designed to protect user privacy during interactions with cloud-based large language models, balancing privacy and utility effectively.

Contribution

It presents the first detailed definition of a pseudonymization framework tailored for cloud-based LLMs, addressing privacy risks during inference.

Findings

01 Framework effectively balances privacy and utility

02 Experimental results validate privacy protection effectiveness

03 Code is publicly available for implementation

Abstract

An increasing number of companies have begun providing services that leverage cloud-based large language models (LLMs), such as ChatGPT. However, this development raises substantial privacy concerns, as users' prompts are transmitted to and processed by the model providers. Among the various privacy protection methods for LLMs, those implemented during the pre-training and fine-tuning phases fail to mitigate the privacy risks associated with the remote use of cloud-based LLMs by users. On the other hand, methods applied during the inference phase are primarily effective in scenarios where the LLM's inference does not rely on privacy-sensitive information. In this paper, we outline the process of remote user interaction with LLMs and, for the first time, propose a detailed definition of a general pseudonymization framework applicable to cloud-based LLMs. The experimental results…
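
To make the idea of pseudonymizing a prompt before it leaves the user's machine concrete, here is a minimal sketch of a detect, replace, and restore round trip. It is not the authors' implementation: the regex-based email detector, the `user<N>@example.com` placeholder scheme, and the names `pseudonymize`, `restore`, and `query_with_pseudonyms` are all assumptions made for illustration, and `remote_llm` is a hypothetical stand-in for any cloud model call.

```python
import re
from typing import Callable, Dict, Tuple

# Assumption: privacy information is limited to email addresses and is
# detected with a simple regex. A real detector would cover far more types.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def pseudonymize(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace each detected email with a placeholder.

    Returns the rewritten text and a placeholder -> original mapping that
    stays on the client side.
    """
    mapping: Dict[str, str] = {}  # placeholder -> original
    seen: Dict[str, str] = {}     # original -> placeholder

    def _substitute(match: "re.Match[str]") -> str:
        original = match.group(0)
        if original not in seen:
            # Hypothetical placeholder scheme; the same entity always
            # receives the same placeholder within one prompt.
            placeholder = f"user{len(seen) + 1}@example.com"
            seen[original] = placeholder
            mapping[placeholder] = original
        return seen[original]

    return EMAIL_RE.sub(_substitute, text), mapping


def restore(text: str, mapping: Dict[str, str]) -> str:
    """Put the original values back into the model's response."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


def query_with_pseudonyms(prompt: str, remote_llm: Callable[[str], str]) -> str:
    """Send only the pseudonymized prompt to the remote model."""
    safe_prompt, mapping = pseudonymize(prompt)
    response = remote_llm(safe_prompt)  # hypothetical cloud call
    return restore(response, mapping)
```

Because the mapping never leaves the client, the provider only sees placeholders, while the user still receives a response phrased in terms of the original values. The sketch assumes the prompt does not already contain strings of the form `user<N>@example.com`, and it ignores cases where the model paraphrases or alters a placeholder in its output.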

Code & Models

Repositories

mebymeby/pseudonymization-framework
PyTorch · Official
