Quantifying Misattribution Unfairness in Authorship Attribution
Pegah Alipoormolabashi, Ajay Patel, Niranjan Balasubramanian

TL;DR
This paper introduces a new fairness measure called MAUIk to quantify how often authors are incorrectly ranked as potential authors of texts they did not write, revealing high unfairness levels in existing models and highlighting risks for certain authors.
Contribution
The paper proposes the Misattribution Unfairness Index (MAUIk) to evaluate fairness in authorship attribution models and analyzes how embedding proximity affects misattribution risks.
Findings
All five models evaluated, across two datasets, exhibit high levels of unfairness.
Authors closer to the embedding centroid face higher misattribution risk.
Unfairness correlates with authors' position in the embedding space.
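To make the centroid finding concrete, the following is a minimal sketch of how one might measure each author's distance to the centroid of the author embeddings. The function names, the toy two-dimensional vectors, and the choice of Euclidean distance are all assumptions made for illustration; the summary above does not state which distance the paper uses.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

def distances_to_centroid(author_vectors):
    """Map each author to the Euclidean distance from the centroid.

    author_vectors: dict of author id -> embedding (list of floats).
    Euclidean distance is an assumption; cosine distance would be
    an equally plausible choice.
    """
    c = centroid(list(author_vectors.values()))
    return {author: math.dist(vec, c) for author, vec in author_vectors.items()}

# Toy embeddings (hypothetical data, not taken from the paper).
author_vectors = {
    "a": [0.0, 0.0],
    "b": [4.0, 0.0],
    "c": [2.0, 3.0],
    "d": [2.0, 1.0],
}
dists = distances_to_centroid(author_vectors)
closest = min(dists, key=dists.get)  # author nearest the centroid
```

Under the finding above, an author such as `closest` would be expected to appear more often among the top-ranked candidates for texts they did not write; checking that would mean correlating these distances with per-author misattribution counts.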
Abstract
Authorship misattribution can have profound consequences in real life. In forensic settings, simply being considered as one of the potential authors of an evidential piece of text or communication can result in undesirable scrutiny. This raises a fairness question: Is every author in the candidate pool at equal risk of misattribution? Standard evaluation measures for authorship attribution systems do not explicitly account for this notion of fairness. We introduce a simple measure, Misattribution Unfairness Index (MAUIk), which is based on how often authors are ranked in the top k for texts they did not write. Using this measure, we quantify the unfairness of five models on two different datasets. All models exhibit high levels of unfairness with increased risks for some authors. Furthermore, we find that this unfairness relates to how the models embed the authors as vectors in the latent…
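The quantity underlying the measure, how often an author lands in the top k for a text they did not write, can be sketched as below. This is an illustration only: the abstract does not give the exact MAUIk formula, so `excess_share` is a hypothetical disparity summary standing in for it, and all names and the toy rankings are assumptions.

```python
from collections import Counter

def misattribution_counts(rankings, true_authors, k):
    """Count, per author, appearances in the top k for texts they did not write.

    rankings: one ranked candidate list per query text, best match first.
    true_authors: the actual author of each query text, in the same order.
    """
    counts = Counter()
    for ranked, truth in zip(rankings, true_authors):
        for author in ranked[:k]:
            if author != truth:
                counts[author] += 1
    return counts

def excess_share(counts, authors):
    """Hypothetical summary (NOT the paper's MAUIk formula).

    Fraction of all misattributions that exceed an even per-author share.
    0.0 means the risk is spread evenly across the candidate pool.
    """
    total = sum(counts[a] for a in authors)
    if total == 0:
        return 0.0
    even = total / len(authors)
    return sum(max(0.0, counts[a] - even) for a in authors) / total

# Toy example (hypothetical data): four authors, four query texts, k = 2.
authors = ["a", "b", "c", "d"]
true_authors = ["a", "c", "d", "b"]
rankings = [
    ["b", "a", "c", "d"],
    ["b", "c", "a", "d"],
    ["b", "a", "d", "c"],
    ["b", "c", "a", "d"],
]
counts = misattribution_counts(rankings, true_authors, k=2)
share = excess_share(counts, authors)
```

In this toy pool, author `b` absorbs most of the misattributions while `d` receives none, which is the kind of uneven risk the measure is meant to expose.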
Taxonomy
Topics: Authorship Attribution and Profiling · Names, Identity, and Discrimination Research · Hate Speech and Cyberbullying Detection
