Siamese Networks for Large-Scale Author Identification
Chakaveh Saedi, Mark Dras

TL;DR
This paper applies Siamese neural networks to large-scale authorship attribution and reports that they can outperform earlier similarity-based methods by learning a notion of similarity rather than relying on a static one.
Contribution
The paper applies Siamese networks to authorship attribution with large candidate pools, taking deep learning methods beyond the small, closed-class classification setups in which they had previously been used for this task.
Findings
Siamese networks outperform previous similarity-based methods.
The choice of energy function and network architecture affects performance.
The approach is effective for large-scale authorship identification.
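To make the terms in these findings concrete, here is a minimal sketch of how a Siamese comparison and an energy function fit together in a similarity-based attribution setup. It is not the paper's architecture: the shared encoder is a hypothetical stand-in (hashed character trigram counts instead of a trained network), and the names `encode`, `euclidean_energy`, `cosine_energy`, `attribute` and `DIM` are assumptions made for illustration. The point it shows is structural: both texts pass through the same encoder, an energy function scores the pair, and the query is assigned to the candidate with the lowest energy.

```python
import math

DIM = 64  # hypothetical embedding size, chosen only for this sketch


def encode(text, n=3, dim=DIM):
    """Shared encoder: every input goes through this same function.

    Stand-in for a learned network (an assumption of this sketch):
    hashed character n-gram counts, L2-normalised.
    """
    vec = [0.0] * dim
    text = text.lower()
    for i in range(len(text) - n + 1):
        gram = text[i:i + n]
        # Deterministic hash; Python's built-in hash() is salted per process.
        idx = sum(ord(c) * 31 ** k for k, c in enumerate(gram)) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def euclidean_energy(a, b):
    """Energy as Euclidean distance between the two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_energy(a, b):
    """Energy as 1 minus the dot product of the two embeddings.

    For the unit-length vectors produced by encode(), this equals
    1 minus the cosine similarity.
    """
    return 1.0 - sum(x * y for x, y in zip(a, b))


def attribute(query, candidates, energy=cosine_energy):
    """Score the query against one sample per candidate author.

    Returns the author with the lowest energy and all scores.
    Adding a candidate needs no retraining, only one more comparison.
    """
    q = encode(query)
    scores = {author: energy(q, encode(sample))
              for author, sample in candidates.items()}
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    candidates = {
        "author_a": "It was a bright cold day in April, and the clocks were striking.",
        "author_b": "lol idk, maybe we shd just meet up later?? ping me when ur free",
    }
    best, scores = attribute("omg idk what 2 do, ping me later??", candidates)
    print(best, scores)
```

Swapping `energy=euclidean_energy` into `attribute` changes only the scoring step, which is the kind of design choice the second finding refers to. In a trained Siamese network the encoder weights would be learned from pairs of texts so that same-author pairs receive low energy, whereas the encoder here is fixed.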
Abstract
Authorship attribution is the process of identifying the author of a text. Approaches to tackling it have been conventionally divided into classification-based ones, which work well for small numbers of candidate authors, and similarity-based methods, which are applicable for larger numbers of authors or for authors beyond the training set; these existing similarity-based methods have only embodied static notions of similarity. Deep learning methods, which blur the boundaries between classification-based and similarity-based approaches, are promising in terms of ability to learn a notion of similarity, but have previously only been used in a conventional small-closed-class classification setup. Siamese networks have been used to develop learned notions of similarity in one-shot image tasks, and also for tasks of mostly semantic relatedness in NLP. We examine their application to the…
