Just Rewrite It Again: A Post-Processing Method for Enhanced Semantic Similarity and Privacy Preservation of Differentially Private Rewritten Text

Stephen Meisenbacher; Florian Matthes

arXiv:2405.19831 · cs.CL · June 3, 2024

Open Access

TL;DR

This paper introduces a post-processing technique that rewrites differentially private texts again to improve semantic similarity to the original and enhance privacy protection against adversaries.

Contribution

It proposes a simple re-rewriting method that boosts both semantic fidelity and empirical privacy in DP text rewriting tasks.

Findings

1. Rewritten texts become more semantically similar to originals.

2. Empirical privacy scores improve with the re-rewriting process.

3. The method adds an extra layer of privacy protection.

Abstract

The study of Differential Privacy (DP) in Natural Language Processing often views the task of text privatization as a rewriting task, in which sensitive input texts are rewritten to hide explicit or implicit private information. In order to evaluate the privacy-preserving capabilities of a DP text rewriting mechanism, empirical privacy tests are frequently employed. In these tests, an adversary is modeled, who aims to infer sensitive information (e.g., gender) about the author behind a (privatized) text. Looking to improve the empirical protections provided by DP rewriting methods, we propose a simple post-processing method based on the goal of aligning rewritten texts with their original counterparts, where DP rewritten texts are rewritten again. Our results show that such an approach not only produces outputs that are more semantically reminiscent of…
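The empirical privacy test described in the abstract can be illustrated with a small sketch. This is not the authors' evaluation setup: the adversary here is a tiny bag-of-words naive Bayes classifier chosen only for brevity, the sensitive attribute (`north` / `south`) and all texts are hypothetical toy data, and the "rewritten" texts are hand-written stand-ins rather than the output of any DP mechanism. The idea shown is only the general one: train an adversary to predict the attribute, then compare its accuracy on original versus rewritten texts.

```python
import math
from collections import Counter, defaultdict


def train_adversary(texts, labels):
    """Fit a multinomial naive Bayes model on whitespace tokens."""
    word_counts = defaultdict(Counter)  # label -> token counts
    label_counts = Counter(labels)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the most likely label; ties go to the alphabetically first label."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.lower().split():
            if tok in vocab:  # ignore tokens never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)  # add-one smoothing
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def adversary_accuracy(model, texts, labels):
    """Fraction of texts whose sensitive attribute the adversary guesses correctly."""
    correct = sum(predict(model, t) == y for t, y in zip(texts, labels))
    return correct / len(texts)


# Hypothetical toy data: the "sensitive attribute" is the author's region.
train_texts = [
    "great match at the north stadium",
    "north market opens early",
    "great food at the south harbor",
    "south harbor opens late",
]
train_labels = ["north", "north", "south", "south"]

original_texts = ["the north stadium was great", "the south harbor was late"]
# Hand-written stand-ins for privatized text (NOT produced by a DP mechanism).
rewritten_texts = ["the venue was great", "the venue was nice"]
labels = ["north", "south"]

model = train_adversary(train_texts, train_labels)
print("adversary accuracy on originals:", adversary_accuracy(model, original_texts, labels))
print("adversary accuracy on rewritten:", adversary_accuracy(model, rewritten_texts, labels))
```

A lower adversary accuracy on the rewritten texts than on the originals is what an empirical privacy evaluation of this kind would read as better protection; a real evaluation would use a much stronger adversary and held-out data.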

Taxonomy

Topics: Digital and Cyber Forensics · Privacy-Preserving Technologies in Data