The Sensitivity of Word Embeddings-based Author Detection Models to Semantic-preserving Adversarial Perturbations
Jeremiah Duncan, Fabian Fallas, Chris Gropp, Emily Herron, Maria Mahbub, Paula Olaya, Eduardo Ponce, Tabitha K. Samuel, Daniel Schultz, Sudarshan Srinivasan, Maofeng Tang, Viktor Zenkov, Quan Zhou, Edmon Begoli

TL;DR
This paper investigates how word embedding-based author detection models are affected by semantic-preserving adversarial perturbations, revealing their sensitivities and limitations in maintaining accuracy under input manipulations.
Contribution
It introduces an experimental framework to evaluate the robustness of authorship detection models against semantic-preserving adversarial attacks.
Findings
Detection performance drops significantly with certain perturbations
Model sensitivity varies based on input and configuration
Different perturbation strategies have distinct impacts on accuracy
Abstract
Authorship analysis is an important subject in the field of natural language processing. It allows the detection of the most likely writer of articles, news, books, or messages. The technique is used in tasks such as authorship attribution, plagiarism detection, style analysis, and tracing sources of misinformation. The focus of this paper is to explore the limitations and sensitivity of established approaches to adversarial manipulations of inputs. To this end, and using those established techniques, we first developed an experimental framework for author detection and input perturbations. Next, we experimentally evaluated the performance of the authorship detection model against a collection of semantic-preserving adversarial perturbations of input narratives. Finally, we compare and analyze the effects of different perturbation strategies, input and model configurations, and…
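The evaluation loop the abstract describes (train a detector, perturb the inputs without changing their meaning, compare accuracy) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the synonym table, the two-author corpus, and the bag-of-words cosine classifier, which stands in for the word-embedding model and is not the authors' method or data.

```python
import random
from collections import Counter

# Hypothetical synonym table (illustrative only; the paper's perturbations are not listed here).
SYNONYMS = {"big": "large", "quick": "fast", "happy": "glad", "said": "stated"}

def perturb(text, rate=1.0, seed=0):
    """Swap each word that has a listed synonym with probability `rate`."""
    rng = random.Random(seed)
    words = []
    for word in text.split():
        if word in SYNONYMS and rng.random() < rate:
            words.append(SYNONYMS[word])
        else:
            words.append(word)
    return " ".join(words)

def bow(text):
    """Bag-of-words vector; a simple stand-in for an embedding-based representation."""
    return Counter(text.split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict(text, profiles):
    """Return the author whose profile is most similar to the text."""
    vec = bow(text)
    return max(profiles, key=lambda author: cosine(vec, profiles[author]))

def accuracy(samples, profiles, transform=lambda t: t):
    """Fraction of (author, text) samples attributed correctly after `transform`."""
    hits = sum(predict(transform(text), profiles) == author for author, text in samples)
    return hits / len(samples)

# Toy corpus (made up): one training text per author, one test text per author.
train = {
    "alice": "the big dog was happy and quick",
    "bob": "the large cat was glad and fast",
}
profiles = {author: bow(text) for author, text in train.items()}
test = [("alice", "a big happy dog"), ("bob", "a large glad cat")]

clean_acc = accuracy(test, profiles)
perturbed_acc = accuracy(test, profiles, transform=perturb)
print(f"clean accuracy: {clean_acc:.2f}, perturbed accuracy: {perturbed_acc:.2f}")
```

In this toy setup the synonym swap moves one author's vocabulary toward the other's, so the comparison between `clean_acc` and `perturbed_acc` mirrors, in miniature, the kind of sensitivity measurement the paper reports.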
Taxonomy
Topics: Authorship Attribution and Profiling · Topic Modeling · Spam and Phishing Detection
