Siamese Networks for Large-Scale Author Identification

Chakaveh Saedi; Mark Dras

arXiv:1912.10616·cs.CL·May 18, 2021

TL;DR

This paper applies Siamese neural networks to large-scale authorship attribution and reports that, by learning a similarity measure rather than relying on a static one, they outperform previous similarity-based methods.

Contribution

It applies Siamese networks to authorship attribution with large candidate pools, extending deep learning approaches to this task beyond the small, closed-class classification setups in which they had previously been used.

Findings

01 Siamese networks outperform previous similarity-based methods.

02 The choice of energy function and architecture affects performance.

03 The approach is effective for large-scale authorship identification.

Abstract

Authorship attribution is the process of identifying the author of a text. Approaches to tackling it have been conventionally divided into classification-based ones, which work well for small numbers of candidate authors, and similarity-based methods, which are applicable for larger numbers of authors or for authors beyond the training set; these existing similarity-based methods have only embodied static notions of similarity. Deep learning methods, which blur the boundaries between classification-based and similarity-based approaches, are promising in terms of ability to learn a notion of similarity, but have previously only been used in a conventional small-closed-class classification setup. Siamese networks have been used to develop learned notions of similarity in one-shot image tasks, and also for tasks of mostly semantic relatedness in NLP. We examine their application to the…
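The similarity-based setup described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the hashed character-trigram featurizer, the single random (untrained) projection layer, the two energy functions, and the contrastive loss are generic stand-ins, and the names `char_ngram_vector`, `SharedEncoder`, and `rank_candidates` are hypothetical. The sketch only shows the general shape of the idea: one encoder shared by both inputs, an energy function that scores a pair of texts, and attribution by ranking candidate authors on that score.

```python
import zlib
import numpy as np


def char_ngram_vector(text, n=3, dim=512):
    """Hash character n-grams into a unit-length count vector (hypothetical featurizer)."""
    vec = np.zeros(dim)
    for i in range(len(text) - n + 1):
        gram = text[i:i + n].encode("utf-8")
        vec[zlib.crc32(gram) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


class SharedEncoder:
    """One set of weights applied to both inputs: the defining Siamese property."""

    def __init__(self, dim=512, hidden=64, seed=0):
        rng = np.random.default_rng(seed)
        # Untrained random weights; a real model would learn these from labeled pairs.
        self.W = rng.normal(size=(dim, hidden))

    def __call__(self, x):
        return np.tanh(x @ self.W)


def cosine_energy(a, b, eps=1e-12):
    """Cosine distance between embeddings: close to 0 when they point the same way."""
    return 1.0 - float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)


def euclidean_energy(a, b):
    """Euclidean distance between embeddings."""
    return float(np.linalg.norm(a - b))


def contrastive_loss(energy, same_author, margin=1.0):
    """Generic contrastive loss (assumed here, not confirmed as the paper's objective).

    Same-author pairs are penalized for high energy; different-author pairs
    are penalized only while their energy is below the margin.
    """
    if same_author:
        return energy ** 2
    return max(0.0, margin - energy) ** 2


def rank_candidates(encoder, query_text, candidates, energy=cosine_energy):
    """Return (author, energy) pairs sorted from most to least similar to the query."""
    q = encoder(char_ngram_vector(query_text))
    scored = [
        (author, energy(q, encoder(char_ngram_vector(text))))
        for author, text in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])
```

Because attribution here is a ranking over whatever candidate texts are supplied, the candidate set can grow or include authors unseen during training without changing the encoder, which is the property the abstract attributes to similarity-based methods. Swapping `cosine_energy` for `euclidean_energy` in `rank_candidates` changes only the scoring step.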
