Personalized Author Obfuscation with Large Language Models
Mohammad Shokri, Sarah Ita Levitan, Rivka Levitan

TL;DR
This paper explores using large language models to obfuscate authorship by paraphrasing, revealing that personalized prompting improves effectiveness and reduces variability across individual users.
Contribution
It introduces a personalized prompting technique for LLM-based author obfuscation, addressing variability in effectiveness across users.
Findings
LLMs are generally effective at author obfuscation.
Performance varies significantly across individual users.
Personalized prompting improves obfuscation success and partially mitigates the bimodal distribution of efficacy across users.
Abstract
In this paper, we investigate the efficacy of large language models (LLMs) in obfuscating authorship by paraphrasing and altering writing styles. Rather than adopting a holistic approach that evaluates performance across the entire dataset, we focus on user-wise performance to analyze how obfuscation effectiveness varies across individual authors. While LLMs are generally effective, we observe a bimodal distribution of efficacy, with performance varying significantly across users. To address this, we propose a personalized prompting method that outperforms standard prompting techniques and partially mitigates the bimodality issue.
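The contrast between a holistic score and user-wise performance can be illustrated with a minimal sketch. It is not the authors' evaluation code: the record format, the function names `per_author_success` and `histogram`, and the definition of success (an attribution model no longer predicts the true author after obfuscation) are all assumptions made for illustration, and the data is a toy example.

```python
from collections import defaultdict

def per_author_success(records):
    """Return {author: fraction of obfuscated texts not attributed to them}.

    `records` is an iterable of (true_author, predicted_author) pairs, where
    the prediction is made on the obfuscated text (hypothetical format).
    """
    totals = defaultdict(int)
    evaded = defaultdict(int)
    for true_author, predicted_author in records:
        totals[true_author] += 1
        if predicted_author != true_author:
            evaded[true_author] += 1
    return {author: evaded[author] / totals[author] for author in totals}

def histogram(rates, bins=5):
    """Count authors per equal-width success-rate bin over [0, 1]."""
    counts = [0] * bins
    for rate in rates.values():
        index = min(int(rate * bins), bins - 1)  # rate == 1.0 goes in the last bin
        counts[index] += 1
    return counts

# Toy data (made up): author "b" is still recognized most of the time.
records = [
    ("a", "b"), ("a", "c"), ("a", "b"), ("a", "a"),
    ("b", "b"), ("b", "b"), ("b", "b"), ("b", "a"),
    ("c", "a"), ("c", "b"),
]

rates = per_author_success(records)
pooled = sum(true != pred for true, pred in records) / len(records)

print("pooled success:", pooled)
print("per-author success:", rates)
print("authors per bin:", histogram(rates))
```

In this toy data the pooled rate sits between the per-author rates and hides the author for whom obfuscation mostly fails. A histogram of per-author rates with mass at both ends is the kind of pattern the abstract calls bimodal.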
Taxonomy
Topics: Authorship Attribution and Profiling · Hate Speech and Cyberbullying Detection · Topic Modeling
Methods: Focus
