StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements

Jillian Fisher; Skyler Hallinan; Ximing Lu; Mitchell Gordon; Zaid Harchaoui; Yejin Choi

arXiv:2408.15666·cs.CL·August 29, 2024

Open Access · 1 Repo · 10 Models · 3 Datasets

TL;DR

StyleRemix is an interpretable authorship obfuscation method that perturbs specific style elements using pre-trained LoRA modules, outperforming state-of-the-art baselines and much larger LLMs across a variety of domains.

Contribution

The paper introduces StyleRemix, a novel, interpretable approach for authorship obfuscation that manipulates style features with low computational cost and provides new datasets for research.

Findings

01 Outperforms state-of-the-art baselines in style obfuscation tasks.

02 Maintains robustness while controlling stylistic features.

03 Achieves high performance with low computational cost.

Abstract

Authorship obfuscation, rewriting a text to intentionally obscure the identity of the author, is an important but challenging task. Current methods using large language models (LLMs) lack interpretability and controllability, often ignoring author-specific stylistic features, resulting in less robust performance overall. To address this, we develop StyleRemix, an adaptive and interpretable obfuscation method that perturbs specific, fine-grained style elements of the original input text. StyleRemix uses pre-trained Low Rank Adaptation (LoRA) modules to rewrite an input specifically along various stylistic axes (e.g., formality and length) while maintaining low computational cost. StyleRemix outperforms state-of-the-art baselines and much larger LLMs in a variety of domains as assessed by both automatic and human evaluation. Additionally, we release AuthorMix, a large set of 30K…
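To make the LoRA idea in the abstract concrete, here is a minimal sketch of how a low-rank adapter modifies a frozen weight matrix, and how several per-style adapters could in principle be blended. The standard LoRA update is W + (alpha / r) · B · A, where r is the adapter rank. Everything else here is an assumption for illustration: the function names, the toy matrices, the axis names in the usage example, and in particular the weighted sum over adapters, which the abstract does not confirm as the way StyleRemix combines its modules.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply an (n x k) matrix by a (k x m) matrix."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_delta(B: Matrix, A: Matrix, alpha: float) -> Matrix:
    """Standard LoRA update (alpha / r) * B @ A, where r is the adapter rank."""
    r = len(A)  # A has shape (r x d_in), B has shape (d_out x r)
    scale = alpha / r
    return [[scale * v for v in row] for row in matmul(B, A)]


def apply_adapters(
    W: Matrix,
    adapters: Dict[str, Tuple[Matrix, Matrix]],
    weights: Dict[str, float],
    alpha: float,
) -> Matrix:
    """Add a weighted sum of LoRA updates to a frozen weight matrix W.

    The per-axis weights are a hypothetical blending scheme, not a
    description of the paper's actual procedure.
    """
    out = [row[:] for row in W]  # leave the base weights untouched
    for name, (B, A) in adapters.items():
        w = weights.get(name, 0.0)
        if w == 0.0:
            continue  # axis not selected for this rewrite
        delta = lora_delta(B, A, alpha)
        for i, row in enumerate(delta):
            for j, v in enumerate(row):
                out[i][j] += w * v
    return out


# Toy usage: a 2x2 base matrix and two rank-1 adapters, one per style axis.
W_base = [[1.0, 0.0], [0.0, 1.0]]
toy_adapters = {
    "formality": ([[1.0], [0.0]], [[0.0, 2.0]]),
    "length": ([[0.0], [1.0]], [[4.0, 0.0]]),
}
W_eff = apply_adapters(W_base, toy_adapters, {"formality": 0.5, "length": 0.25}, alpha=1.0)
```

Because each adapter stores only the two small factors B and A rather than a full weight matrix, keeping one adapter per style axis stays cheap, which is consistent with the low computational cost the abstract emphasizes.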

Code & Models

Repositories

jfisher52/StyleRemix
PyTorch · Official

Taxonomy

Topics: Authorship Attribution and Profiling · Topic Modeling · Natural Language Processing Techniques

Methods: Sparse Evolutionary Training