STAMP: Selective Task-Aware Mechanism for Text Privacy
Fengwei Tian, Payel Bhattacharjee, Heidi Hanson, Geoffrey D. Rubin, Joseph Y. Lo, Ravi Tandon

TL;DR
STAMP introduces a task-aware text privatization framework that selectively allocates privacy noise at the token level, improving the balance between privacy and utility by considering token importance and sensitivity.
Contribution
The paper proposes STAMP, a framework that allocates privacy budgets at the token level based on each token's task importance and privacy sensitivity, and introduces the polar mechanism, which perturbs the direction of token embeddings.
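A minimal sketch of what token-level budget allocation by importance and sensitivity could look like. The `Token` class, the boolean flags, and every epsilon value below are hypothetical illustrations, not taken from the paper; the ordering of budgets across groups (most budget for important, non-sensitive tokens; least for unimportant, sensitive ones) is likewise an assumption.

```python
from dataclasses import dataclass

# Hypothetical per-group privacy budgets, keyed by (important, sensitive).
# Larger epsilon means less noise. These numbers are illustrative only.
EPSILON_BY_GROUP = {
    (True, False): 8.0,   # task-relevant, not sensitive: least noise
    (True, True): 4.0,    # task-relevant but sensitive
    (False, False): 2.0,  # neither task-relevant nor sensitive
    (False, True): 0.5,   # sensitive and not needed for the task: most noise
}


@dataclass
class Token:
    text: str
    important: bool  # assumed to come from some task-relevance score
    sensitive: bool  # assumed to come from some sensitive-entity detector


def allocate_budgets(tokens):
    """Return one epsilon per token, chosen by its (importance, sensitivity) group."""
    return [EPSILON_BY_GROUP[(t.important, t.sensitive)] for t in tokens]


tokens = [
    Token("Alice", important=False, sensitive=True),
    Token("reviewed", important=True, sensitive=False),
    Token("the", important=False, sensitive=False),
    Token("2019", important=True, sensitive=True),
]
budgets = allocate_budgets(tokens)
```

In this sketch the two flags partition tokens into four groups, so the noise level is controlled per group rather than per individual token.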
Findings
Outperforms existing methods in privacy-utility trade-offs
Maintains semantic neighborhoods better than isotropic noise mechanisms
Effective across multiple datasets, including SQuAD, Yelp, and AG News
Abstract
We present STAMP (Selective Task-Aware Mechanism for Text Privacy), a new framework for task-aware text privatization that achieves an improved privacy-utility trade-off. STAMP selectively allocates privacy budgets across tokens by jointly considering (i) each token's importance to the downstream task (as measured via a task- or query-specific representation), and (ii) its privacy sensitivity (e.g., names, dates, identifiers). This token-level partitioning enables fine-grained, group-wise control over the level of noise applied to different parts of the input, balancing privacy protection with task relevance. To privatize individual token embeddings, we introduce the polar mechanism, which perturbs only the direction of embeddings on the unit sphere while preserving their magnitude. Decoding is performed via cosine nearest-neighbor search, aligning the perturbation geometry with the…
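The direction-only perturbation and cosine nearest-neighbor decoding described in the abstract can be sketched as follows. This is an illustration under stated assumptions: the noise used here (Gaussian jitter on the unit direction, followed by renormalization) is a stand-in and not the paper's polar mechanism, it carries no formal privacy guarantee, and the toy vocabulary, the 1/epsilon noise scale, and all function names are hypothetical.

```python
import math
import random


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))


def perturb_direction(v, epsilon, rng):
    """Perturb only the direction of v and keep its magnitude.

    Stand-in noise (an assumption, not the paper's distribution):
    add Gaussian jitter with scale 1/epsilon to the unit direction,
    project back onto the unit sphere, then restore the original norm.
    """
    r = norm(v)
    unit = [x / r for x in v]
    noisy = [x + rng.gauss(0.0, 1.0 / epsilon) for x in unit]
    n = norm(noisy)
    return [r * x / n for x in noisy]


def decode(v, vocab):
    """Return the vocabulary word whose embedding has the highest cosine similarity to v."""
    return max(vocab, key=lambda w: cosine(v, vocab[w]))


# Toy 3-dimensional vocabulary (hypothetical embeddings).
vocab = {
    "cat": [1.0, 0.1, 0.0],
    "dog": [0.9, 0.3, 0.1],
    "car": [0.0, 1.0, 0.2],
}

rng = random.Random(0)
original = [2.0, 0.2, 0.0]  # same direction as "cat", larger magnitude
private = perturb_direction(original, epsilon=5.0, rng=rng)
token = decode(private, vocab)
```

Because decoding uses cosine similarity, only the direction of the perturbed vector affects which token is returned, which is why the sketch leaves the magnitude untouched.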
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Privacy, Security, and Data Protection
