TL;DR
This paper proposes a general pseudonymization framework that protects user privacy when prompts are sent to cloud-based large language models, with the aim of balancing privacy and utility.
Contribution
It presents the first detailed definition of a pseudonymization framework tailored for cloud-based LLMs, addressing privacy risks during inference.
Findings
Framework effectively balances privacy and utility
Experimental results validate privacy protection effectiveness
Code is publicly available for implementation
Abstract
An increasing number of companies have begun providing services that leverage cloud-based large language models (LLMs), such as ChatGPT. However, this development raises substantial privacy concerns, as users' prompts are transmitted to and processed by the model providers. Among the various privacy protection methods for LLMs, those implemented during the pre-training and fine-tuning phases fail to mitigate the privacy risks associated with the remote use of cloud-based LLMs by users. On the other hand, methods applied during the inference phase are primarily effective in scenarios where the LLM's inference does not rely on privacy-sensitive information. In this paper, we outline the process of remote user interaction with LLMs and, for the first time, propose a detailed definition of a general pseudonymization framework applicable to cloud-based LLMs. The experimental results…
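The abstract does not spell out the framework's components, so the following is only a minimal sketch of the generic detect, replace, and restore pattern that pseudonymization around a remote LLM call usually follows. It is not the paper's method. The regex patterns, the `<LABEL_n>` placeholder format, and the names `pseudonymize`, `restore`, `private_query`, and `fake_cloud_llm` are all assumptions made for illustration.

```python
import re
from typing import Callable, Dict, Tuple

# Hypothetical detectors for illustration only; a real system would more
# likely rely on a named-entity recognizer than on two regular expressions.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def pseudonymize(prompt: str) -> Tuple[str, Dict[str, str]]:
    """Replace detected sensitive spans with placeholders, locally.

    Returns the rewritten prompt and a placeholder -> original mapping
    that never leaves the user's machine.
    """
    mapping: Dict[str, str] = {}  # placeholder -> original value
    reverse: Dict[str, str] = {}  # original value -> placeholder

    for label, pattern in PATTERNS.items():
        def substitute(match: "re.Match[str]", label: str = label) -> str:
            original = match.group(0)
            if original not in reverse:
                # Assumed placeholder format, numbered across all labels.
                placeholder = f"<{label}_{len(reverse) + 1}>"
                reverse[original] = placeholder
                mapping[placeholder] = original
            return reverse[original]

        prompt = pattern.sub(substitute, prompt)
    return prompt, mapping


def restore(text: str, mapping: Dict[str, str]) -> str:
    """Put the original values back into the model's response."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


def fake_cloud_llm(prompt: str) -> str:
    """Stand-in for a remote model call; it only echoes the prompt."""
    return f"Echo: {prompt}"


def private_query(prompt: str, llm: Callable[[str], str]) -> str:
    """Pseudonymize, query the remote model, then restore the answer."""
    safe_prompt, mapping = pseudonymize(prompt)
    response = llm(safe_prompt)  # only the pseudonymized text is sent
    return restore(response, mapping)
```

In this sketch the remote model only ever sees the placeholders, and the mapping needed to reverse them stays on the client. How well the model can still answer once the sensitive values are hidden is the privacy-versus-utility trade-off the summary refers to.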
