Protecting Privacy in Classifiers by Token Manipulation
Re'em Harel, Yair Elboher, Yuval Pinter

TL;DR
This paper investigates text manipulation techniques to protect user privacy in language model classification tasks, aiming to prevent data exposure while maintaining classifier accuracy.
Contribution
It introduces and compares token mapping and contextualized manipulation methods for text privacy, highlighting their impact on accuracy and reconstructability.
Findings
Some token mapping functions are easy to implement, but they heavily affect performance on the downstream task.
Contextualized manipulation improves classifier performance relative to token mapping.
A sophisticated attacker can reconstruct text that was protected by token mapping.
Abstract
Using language models as a remote service entails sending private information to an untrusted provider. In addition, potential eavesdroppers can intercept the messages, thereby exposing the information. In this work, we explore the prospects of avoiding such data exposure at the level of text manipulation. We focus on text classification models, examining various token mapping and contextualized manipulation functions in order to see whether classifier accuracy may be maintained while keeping the original text unrecoverable. We find that although some token mapping functions are easy and straightforward to implement, they heavily influence performance on the downstream task and can be reconstructed by a sophisticated attacker. In comparison, contextualized manipulation provides an improvement in performance.
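To make the idea of a token mapping function concrete, here is a minimal Python sketch. It assumes the simplest possible variant, a secret one-to-one shuffle of token IDs applied before text leaves the client. This summary does not specify the mapping functions the paper actually evaluates, so the helper names (`make_token_mapping`, `apply_mapping`, `invert_mapping`), the vocabulary size, and the seed are all hypothetical.

```python
import random

def make_token_mapping(vocab_size, seed):
    """Build a secret bijection over token IDs (hypothetical helper).

    Assumption: the mapping is a seeded random permutation of the vocabulary.
    """
    rng = random.Random(seed)
    permuted = list(range(vocab_size))
    rng.shuffle(permuted)
    return permuted  # permuted[original_id] == mapped_id

def invert_mapping(mapping):
    """Return the inverse permutation, usable only by the key holder."""
    inverse = [0] * len(mapping)
    for original_id, mapped_id in enumerate(mapping):
        inverse[mapped_id] = original_id
    return inverse

def apply_mapping(token_ids, mapping):
    """Replace every token ID with its image under the mapping."""
    return [mapping[t] for t in token_ids]

# Illustrative token IDs; a real pipeline would get these from a tokenizer.
token_ids = [5, 17, 5, 42, 17, 5]
mapping = make_token_mapping(vocab_size=100, seed=1234)

obfuscated = apply_mapping(token_ids, mapping)
recovered = apply_mapping(obfuscated, invert_mapping(mapping))
```

A one-to-one mapping like this keeps repetition patterns and token frequencies intact: positions that held the same token before the mapping still hold the same token afterwards. That property is one plausible reason an attacker with knowledge of typical token statistics could attempt reconstruction, in line with the vulnerability the abstract reports.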
Taxonomy
Topics: Advanced Malware Detection Techniques · Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data
