Beyond Theoretical Bounds: Empirical Privacy Loss Calibration for Text Rewriting Under Local Differential Privacy
Weijun Li, Arnaud Grivet Sébert, Qiongkai Xu, Annabelle McIver, Mark Dras

TL;DR
This paper introduces TeDA, an empirical calibration method for text rewriting under Local Differential Privacy, enabling better comparison of privacy guarantees beyond nominal epsilon values.
Contribution
We propose TeDA, a hypothesis-testing framework for empirically calibrating privacy loss in LDP text rewriting mechanisms, improving interpretability and comparison.
Findings
Empirical calibration reveals differences in privacy levels not captured by nominal epsilon.
TeDA enables practical assessment of privacy-utility trade-offs in real-world deployments.
Different mechanisms with similar epsilon can have significantly different distinguishability.
Abstract
The growing use of large language models has increased interest in sharing textual data in a privacy-preserving manner. One prominent line of work addresses this challenge through text rewriting under Local Differential Privacy (LDP), where input texts are locally obfuscated before release with formal privacy guarantees. These guarantees are typically expressed by a parameter ε that upper bounds the worst-case privacy loss. However, nominal ε values are often difficult to interpret and compare across mechanisms. In this work, we investigate how to empirically calibrate privacy loss across text rewriting mechanisms under LDP. We propose TeDA, which formulates calibration via a hypothesis-testing framework that instantiates text distinguishability audits in both surface and embedding spaces, enabling empirical assessment of indistinguishability from privatized texts. Applying…
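To make the hypothesis-testing view of privacy loss concrete, here is a minimal sketch. It is not TeDA itself, whose procedure is not specified in this summary. It uses the standard result that, under pure ε-DP, any test distinguishing the outputs of two inputs must satisfy FPR + e^ε · FNR ≥ 1 and FNR + e^ε · FPR ≥ 1, so an observed pair of error rates implies a lower bound on ε. The function name and the example error rates are hypothetical.

```python
import math


def empirical_epsilon_lower_bound(fpr: float, fnr: float) -> float:
    """Lower bound on epsilon implied by a distinguishing test (pure DP, delta = 0).

    fpr: rate at which the test wrongly says "input B" when the input was A.
    fnr: rate at which the test wrongly says "input A" when the input was B.
    This is a point estimate; it ignores sampling error in fpr and fnr.
    """

    def log_ratio(num: float, den: float) -> float:
        if num <= 0.0:
            return -math.inf  # this constraint gives no information
        if den == 0.0:
            return math.inf   # a perfect test rules out every finite epsilon
        return math.log(num / den)

    # Rearranging FPR + e^eps * FNR >= 1 and FNR + e^eps * FPR >= 1 for eps.
    return max(0.0, log_ratio(1.0 - fpr, fnr), log_ratio(1.0 - fnr, fpr))


# Hypothetical audit outcomes for two mechanisms with the same nominal epsilon.
print(empirical_epsilon_lower_bound(0.5, 0.5))  # chance-level attacker
print(empirical_epsilon_lower_bound(0.1, 0.1))  # strong attacker
```

In practice the error rates are estimated from a finite number of trials, so an audit would replace them with confidence bounds (for example Clopper-Pearson intervals) before converting to ε.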
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Cryptography and Data Security
