Differentially-Private Text Rewriting reshapes Linguistic Style
Stefan Arnold

TL;DR
This paper investigates how differentially-private text rewriting affects linguistic style, and finds that privacy-preserving rewriting systematically alters the stylistic and functional profile of a text, not only its vocabulary.
Contribution
It provides a multidimensional stylistic analysis of differentially-private text rewriting, highlighting the loss of interactive markers, contextual references, and complex subordination.
Findings
Privacy-preserving rewriting reduces stylistic diversity.
Both autoregressive and bidirectional methods converge to a neutral register.
Semantic content is preserved despite stylistic homogenization.
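To make the idea of stylistic homogenization concrete, here is a minimal sketch of how the rate of a few stylistic markers could be compared between an original text and its rewritten version. It is not the paper's method: the marker lists (`INTERACTIVE`, `SUBORDINATORS`), the tokenizer, and the function names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical marker lists for illustration only; not the paper's feature set.
INTERACTIVE = {"i", "you", "we", "me", "us", "my", "your", "our"}
SUBORDINATORS = {"because", "although", "though", "while", "since",
                 "unless", "whereas", "if"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def marker_rates(text):
    """Return the share of tokens that fall into each marker category."""
    tokens = tokenize(text)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {
        "interactive": sum(t in INTERACTIVE for t in tokens) / total,
        "subordination": sum(t in SUBORDINATORS for t in tokens) / total,
    }


def rate_shift(original, rewritten):
    """Return rewritten-minus-original marker rates (negative means attrition)."""
    before = marker_rates(original)
    after = marker_rates(rewritten)
    return {name: after[name] - before[name] for name in before}


original = "I think you should stay because we need you."
rewritten = "The person should stay."
shift = rate_shift(original, rewritten)
```

In this toy pair, both entries of `shift` are negative, which is the kind of marker attrition the findings describe. A real analysis would need a far richer feature inventory and a corpus-level comparison.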
Abstract
Differential Privacy (DP) for text matured from disjointed word-level substitutions to contiguous sentence-level rewriting by leveraging the generative capacity of language models. While this form of text privatization is best suited for balancing formal privacy guarantees with grammatical coherence, its impact on the register identity of text remains largely unexplored. By conducting a multidimensional stylistic profiling of differentially-private rewriting, we demonstrate that the cost of privacy extends far beyond lexical variation. Specifically, we find that rewriting under privacy constraints induces a systematic functional mutation of the text's communicative signature. This shift is characterized by the severe attrition of interactive markers, contextual references, and complex subordination. By comparing autoregressive paraphrasing against bidirectional substitution across a…
