A Girl Has A Name, And It's ... Adversarial Authorship Attribution for Deobfuscation

Wanyue Zhai; Jonathan Rusert; Zubair Shafiq; Padmini Srinivasan

arXiv:2203.11849·cs.CL·March 23, 2022·1 cites

TL;DR

This paper explores the challenge of adversarial authorship attribution, demonstrating that trained attributors can significantly reduce obfuscation effectiveness, highlighting the need for more robust privacy-preserving methods.

Contribution

It introduces the problem of adversarial deobfuscation in authorship attribution and evaluates how adversarial training impacts obfuscation effectiveness.

Findings

01. Adversarially trained attributors reduce obfuscation effectiveness from 20-30% to 5-10%.

02. Degradation in attribution accuracy occurs even when attributors have incorrect assumptions.

03. Current obfuscation methods are vulnerable to adversarial deobfuscation, requiring stronger defenses.

Abstract

Recent advances in natural language processing have enabled powerful privacy-invasive authorship attribution. To counter authorship attribution, researchers have proposed a variety of rule-based and learning-based text obfuscation approaches. However, existing authorship obfuscation approaches do not consider the adversarial threat model. Specifically, they are not evaluated against adversarially trained authorship attributors that are aware of potential obfuscation. To fill this gap, we investigate the problem of adversarial authorship attribution for deobfuscation. We show that adversarially trained authorship attributors are able to degrade the effectiveness of existing obfuscators from 20-30% to 5-10%. We also evaluate the effectiveness of adversarial training when the attributor makes incorrect assumptions about whether and which obfuscator was used. While there is a clear…
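The threat model in the abstract can be illustrated with a toy sketch. It is not the paper's attributors, obfuscators, or data: the synonym table, the two authors, the word-count attributor, and every function name below are hypothetical stand-ins chosen to keep the example small. It compares an attributor trained on original text with one retrained on obfuscated text, which is assumed here to be what "adversarially trained" amounts to in its simplest form.

```python
from collections import Counter

# Hypothetical synonym table standing in for a rule-based obfuscator.
SYNONYMS = {"honestly": "truthfully", "great": "excellent",
            "frankly": "candidly", "fine": "okay"}

def obfuscate(text):
    """Replace each word that has an entry in SYNONYMS; keep the rest."""
    return " ".join(SYNONYMS.get(w, w) for w in text.split())

def train(samples):
    """Build one word-count profile per author from (author, text) pairs."""
    profiles = {}
    for author, text in samples:
        profiles.setdefault(author, Counter()).update(text.split())
    return profiles

def predict(profiles, text):
    """Pick the author whose profile overlaps most with the text.
    Ties fall to the alphabetically first author."""
    def score(author):
        return sum(profiles[author][w] for w in text.split())
    return max(sorted(profiles), key=score)

def accuracy(profiles, samples):
    hits = sum(predict(profiles, text) == author for author, text in samples)
    return hits / len(samples)

# Toy corpus (invented for illustration only).
TRAIN = [("a", "honestly the plan is great"),
         ("a", "honestly this idea is great"),
         ("b", "frankly the plan is fine"),
         ("b", "frankly this idea is fine")]
TEST = [("a", "honestly the result is great"),
        ("b", "frankly the result is fine")]

obf_train = [(author, obfuscate(text)) for author, text in TRAIN]
obf_test = [(author, obfuscate(text)) for author, text in TEST]

baseline = train(TRAIN)          # attributor unaware of obfuscation
adversarial = train(obf_train)   # attributor retrained on obfuscated text

clean_acc = accuracy(baseline, TEST)
base_acc = accuracy(baseline, obf_test)
adv_acc = accuracy(adversarial, obf_test)
print("baseline on original text:     ", clean_acc)
print("baseline on obfuscated text:   ", base_acc)
print("adversarial on obfuscated text:", adv_acc)
```

In this toy setup the obfuscator hides each author's marker words from the baseline but replaces them consistently, so an attributor that has seen obfuscated text can learn the replacements. The numbers it prints describe only this invented corpus and say nothing about the 20-30% and 5-10% figures reported in the abstract.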

Code & Models

Repositories

reginazhai/authorship-deobfuscation (Official)

Taxonomy

Topics: Authorship Attribution and Profiling · Hate Speech and Cyberbullying Detection · Topic Modeling

Methods: Attentive Walk-Aggregating Graph Neural Network