StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements
Jillian Fisher, Skyler Hallinan, Ximing Lu, Mitchell Gordon, Zaid Harchaoui, Yejin Choi

TL;DR
StyleRemix is an interpretable authorship obfuscation method that perturbs specific style elements using pre-trained modules, outperforming larger models in robustness and controllability.
Contribution
The paper introduces StyleRemix, a novel, interpretable approach for authorship obfuscation that manipulates style features with low computational cost and provides new datasets for research.
Findings
Outperforms state-of-the-art baselines in style obfuscation tasks.
Maintains robustness while controlling stylistic features.
Achieves high performance with low computational cost.
Abstract
Authorship obfuscation, rewriting a text to intentionally obscure the identity of the author, is an important but challenging task. Current methods using large language models (LLMs) lack interpretability and controllability, often ignoring author-specific stylistic features, resulting in less robust performance overall. To address this, we develop StyleRemix, an adaptive and interpretable obfuscation method that perturbs specific, fine-grained style elements of the original input text. StyleRemix uses pre-trained Low Rank Adaptation (LoRA) modules to rewrite an input specifically along various stylistic axes (e.g., formality and length) while maintaining low computational cost. StyleRemix outperforms state-of-the-art baselines and much larger LLMs in a variety of domains as assessed by both automatic and human evaluation. Additionally, we release AuthorMix, a large set of 30K…
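The abstract describes perturbing specific style elements but does not spell out how axes are chosen. As a rough illustration only, the sketch below assumes each text can be scored per style axis and that obfuscation targets the axes where an author deviates most from some reference population. The function name `choose_perturbations`, the score dictionaries, and the top-`k` rule are all hypothetical and are not taken from the paper.

```python
def choose_perturbations(author_scores, reference_scores, k=2):
    """Return up to k (axis, direction) pairs, largest deviation first.

    ASSUMPTION: scores are floats on a shared scale per axis; this
    selection rule is illustrative, not the paper's stated procedure.
    """
    # Signed gap between the author and the reference on each axis.
    deviations = {
        axis: author_scores[axis] - reference_scores[axis]
        for axis in author_scores
    }
    # Rank axes by how far the author sits from the reference.
    ranked = sorted(deviations, key=lambda a: abs(deviations[a]), reverse=True)
    plan = []
    for axis in ranked:
        gap = deviations[axis]
        if gap == 0:
            continue  # nothing distinctive on this axis
        # Push the text back toward the reference value.
        plan.append((axis, "decrease" if gap > 0 else "increase"))
    return plan[:k]


# Hypothetical scores for one author and a reference population.
author = {"formality": 0.9, "length": 0.4, "sarcasm": 0.1}
reference = {"formality": 0.5, "length": 0.5, "sarcasm": 0.4}
plan = choose_perturbations(author, reference)
print(plan)
```

Each returned pair would then map to one style-specific rewrite, which is what keeps the procedure interpretable: the plan states which axes were changed and in which direction.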
Code & Models
- 🤗 hallisky/lora-sarcasm-more-llama-3-8b · model · 20 downloads
- 🤗 hallisky/lora-function-more-llama-3-8b · model · 19 downloads
- 🤗 hallisky/lora-formality-formal-llama-3-8b · model · 19 downloads
- 🤗 hallisky/lora-type-persuasive-llama-3-8b · model · 19 downloads
- 🤗 hallisky/lora-length-long-llama-3-8b · model · 23 downloads
- 🤗 hallisky/lora-formality-informal-llama-3-8b · model · 18 downloads
- 🤗 hallisky/lora-voice-active-llama-3-8b · model · 19 downloads
- 🤗 hallisky/lora-type-expository-llama-3-8b · model · 19 downloads
- 🤗 hallisky/lora-grade-highschool-llama-3-8b · model · 18 downloads
- 🤗 hallisky/lora-voice-passive-llama-3-8b · model · 18 downloads
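The repository names above appear to follow the pattern `lora-<axis>-<direction>-llama-3-8b`. A minimal sketch, assuming that pattern holds for every listed adapter, groups them by style axis. The helpers `parse_adapter` and `group_by_axis` are hypothetical names introduced here for illustration and are not part of any released code.

```python
import re
from collections import defaultdict

# Repository IDs copied from the list above.
ADAPTERS = [
    "hallisky/lora-sarcasm-more-llama-3-8b",
    "hallisky/lora-function-more-llama-3-8b",
    "hallisky/lora-formality-formal-llama-3-8b",
    "hallisky/lora-type-persuasive-llama-3-8b",
    "hallisky/lora-length-long-llama-3-8b",
    "hallisky/lora-formality-informal-llama-3-8b",
    "hallisky/lora-voice-active-llama-3-8b",
    "hallisky/lora-type-expository-llama-3-8b",
    "hallisky/lora-grade-highschool-llama-3-8b",
    "hallisky/lora-voice-passive-llama-3-8b",
]

# ASSUMPTION: <owner>/lora-<axis>-<direction>-llama-3-8b
_PATTERN = re.compile(r"^[^/]+/lora-([a-z]+)-([a-z]+)-llama-3-8b$")


def parse_adapter(repo_id):
    """Split a repository ID into (axis, direction)."""
    match = _PATTERN.match(repo_id)
    if match is None:
        raise ValueError(f"unexpected adapter name: {repo_id}")
    return match.group(1), match.group(2)


def group_by_axis(repo_ids):
    """Map each style axis to the directions that have an adapter."""
    axes = defaultdict(list)
    for repo_id in repo_ids:
        axis, direction = parse_adapter(repo_id)
        axes[axis].append(direction)
    return dict(axes)


for axis, directions in sorted(group_by_axis(ADAPTERS).items()):
    print(axis, directions)
```

Under this reading, the ten adapters shown cover seven axes, with formality, voice, and type each having two directions in the list.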
Taxonomy
Topics: Authorship Attribution and Profiling · Topic Modeling · Natural Language Processing Techniques
Methods: Sparse Evolutionary Training
