Style Pooling: Automatic Text Style Obfuscation for Improved Classification Fairness
Fatemehsadat Mireshghallah, Taylor Berg-Kirkpatrick

TL;DR
This paper introduces a VAE-based style transfer framework that obfuscates stylistic features in text to improve fairness in classification tasks. It supports two notions of obfuscated style: a minimal one that intersects the styles seen in training and a maximal one that adds stylistic features.
Contribution
It proposes a novel style pooling method for text obfuscation using VAEs, enabling flexible style transfer to improve classifier fairness.
Findings
Effective style obfuscation improves fairness in classification.
The framework maintains fluency and semantic consistency.
Style pooling impacts attribute removal and text quality.
Abstract
Text style can reveal sensitive attributes of the author (e.g. race or age) to the reader, which can, in turn, lead to privacy violations and bias in both human and algorithmic decisions based on text. For example, the style of writing in job applications might reveal protected attributes of the candidate which could lead to bias in hiring decisions, regardless of whether hiring decisions are made algorithmically or by humans. We propose a VAE-based framework that obfuscates stylistic features of human-generated text through style transfer by automatically re-writing the text itself. Our framework operationalizes the notion of obfuscated style in a flexible way that enables two distinct notions of obfuscated style: (1) a minimal notion that effectively intersects the various styles seen in training, and (2) a maximal notion that seeks to obfuscate by adding stylistic features of all…
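The two notions of obfuscated style described in the abstract can be pictured with a toy set analogy. The sketch below is not the paper's VAE method: it assumes, purely for illustration, that each style is a set of hypothetical surface features (the style names and feature names are invented), and shows the "minimal" notion as a set intersection and the "maximal" notion as a set union.

```python
# Toy analogy only; NOT the paper's VAE-based method.
# Assumption: each style is represented as a set of hypothetical surface features.
style_features = {
    "style_a": {"contractions", "short_sentences", "emoji"},
    "style_b": {"contractions", "short_sentences", "formal_greetings"},
    "style_c": {"contractions", "short_sentences", "hashtags"},
}


def minimal_style(styles):
    """Minimal notion (analogy): keep only features shared by every style."""
    return set.intersection(*styles.values())


def maximal_style(styles):
    """Maximal notion (analogy): combine the features of all styles."""
    return set.union(*styles.values())


if __name__ == "__main__":
    print(sorted(minimal_style(style_features)))
    print(sorted(maximal_style(style_features)))
```

In this analogy, text rewritten toward the intersection carries no feature that distinguishes one style from another, while text rewritten toward the union carries the distinguishing features of every style at once; in both cases a reader or classifier has less signal for inferring which single style the author used.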
Taxonomy
Topics: Authorship Attribution and Profiling · Imbalanced Data Classification Techniques · Hate Speech and Cyberbullying Detection
