On the Impact of Noise in Differentially Private Text Rewriting

Stephen Meisenbacher, Maulik Chevli, and Florian Matthes

arXiv:2501.19022·cs.CL·February 3, 2025

TL;DR

This paper investigates how noise affects utility and privacy in differentially private (DP) text rewriting. It introduces a new sentence infilling method and compares DP rewriting with non-DP approaches.

Contribution

The paper introduces a novel sentence infilling privatization technique and uses it to empirically analyze the impact of noise in DP text rewriting, in comparison with non-DP methods.

Findings

01 Non-DP methods better preserve utility.

02 DP methods offer stronger privacy protections.

03 Noise significantly impacts the effectiveness of DP in NLP.

Abstract

The field of text privatization often leverages the notion of Differential Privacy (DP) to provide formal guarantees in the rewriting or obfuscation of sensitive textual data. A common and nearly ubiquitous form of DP application necessitates the addition of calibrated noise to vector representations of text, either at the data- or model-level, which is governed by the privacy parameter $ε$. However, noise addition almost undoubtedly leads to considerable utility loss, thereby highlighting one major drawback of DP in NLP. In this work, we introduce a new sentence infilling privatization technique, and we use this method to explore the effect of noise in DP text rewriting. We empirically demonstrate that non-DP privatization techniques excel in utility preservation and can find an acceptable empirical privacy-utility trade-off, yet cannot outperform DP methods in…
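For intuition only, the calibrated noise addition mentioned in the abstract can be sketched with the standard Laplace mechanism, where the noise scale is the sensitivity divided by $ε$. This is an illustrative assumption and not the paper's sentence infilling method: the function name laplace_privatize, the clipping bound clip, and the toy vector are hypothetical.

```python
import random


def laplace_privatize(vec, epsilon, clip=1.0, rng=None):
    """Clip a vector coordinate-wise and add Laplace noise calibrated to epsilon.

    Hypothetical helper for illustration; not taken from the paper or its repo.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    # Assumption: bound each coordinate to [-clip, clip] so sensitivity is finite.
    clipped = [max(-clip, min(clip, float(x))) for x in vec]
    if not clipped:
        return []
    # Worst-case L1 distance between any two clipped vectors of this length.
    sensitivity = 2.0 * clip * len(clipped)
    # Laplace mechanism: the noise scale grows as epsilon shrinks.
    scale = sensitivity / epsilon
    rate = 1.0 / scale
    # The difference of two i.i.d. exponentials with this rate is Laplace(0, scale).
    return [x + rng.expovariate(rate) - rng.expovariate(rate) for x in clipped]


if __name__ == "__main__":
    embedding = [i / 10 for i in range(-4, 5)]  # toy stand-in for a text embedding
    for eps in (0.1, 1.0, 10.0):
        noisy = laplace_privatize(embedding, eps, rng=random.Random(0))
        mean_abs_noise = sum(abs(n - e) for n, e in zip(noisy, embedding)) / len(embedding)
        print(eps, mean_abs_noise)
```

Running the demo with a fixed seed shows the average distortion shrinking as $ε$ grows, which is the utility cost of strong privacy settings that the abstract refers to.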

Code & Models

Repositories

sjmeis/privfill (PyTorch, official)

Videos

On the Impact of Noise in Differentially Private Text Rewriting · Underline

Taxonomy

Topics: Handwritten Text Recognition Techniques · Semigroups and Automata Theory · DNA and Biological Computing