TL;DR
This paper introduces an attention-based neural network approach for authorship verification in social media texts, outperforming traditional stylometric methods and providing interpretability through visualization of decision factors.
Contribution
It extends a hierarchical Siamese neural network to learn neural features and to visualize its decision process, with the design tailored to short, topically varied social media texts.
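To make the Siamese idea concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: it is single-level rather than hierarchical, untrained, and every name in it (`ToyEncoder`, `attention_pool`, `siamese_distance`) is hypothetical. It only illustrates the two ingredients named above: one shared encoder applied to both texts, and attention weights that can be inspected per token.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Score each vector against `query`, then return the weighted sum.

    Also returns the attention weights so they can be visualized.
    """
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


class ToyEncoder:
    """Hypothetical shared encoder: random token embeddings + attention pooling.

    Assumption: weights are random and untrained; a real model would learn them.
    """

    def __init__(self, dim=8, vocab_buckets=64, seed=0):
        rng = random.Random(seed)
        self.vocab_buckets = vocab_buckets
        self.embeddings = [
            [rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_buckets)
        ]
        self.query = [rng.uniform(-1, 1) for _ in range(dim)]

    def _bucket(self, token):
        # Deterministic toy hash (the built-in hash() is salted per process).
        return sum(ord(c) for c in token) % self.vocab_buckets

    def encode(self, text):
        """Encode a non-empty text; return (vector, [(token, weight), ...])."""
        tokens = text.lower().split()
        vectors = [self.embeddings[self._bucket(t)] for t in tokens]
        pooled, weights = attention_pool(vectors, self.query)
        return pooled, list(zip(tokens, weights))


def siamese_distance(encoder, text_a, text_b):
    """Run the SAME encoder on both texts and compare them by Euclidean distance."""
    vec_a, attn_a = encoder.encode(text_a)
    vec_b, attn_b = encoder.encode(text_b)
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(vec_a, vec_b)))
    return dist, attn_a, attn_b


encoder = ToyEncoder()
dist, attn_a, attn_b = siamese_distance(
    encoder, "omg thats sooo gr8 lol", "I do believe that is great"
)
```

In a trained model, a small distance would be read as evidence of shared authorship, and the per-token weights in `attn_a` and `attn_b` are the kind of quantity one could plot to inspect what drove the decision.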
Findings
The Siamese network outperforms stylometric approaches
The model effectively captures linguistic features
Attention weights align with linguistic categories
Abstract
Authorship verification is the task of analyzing the linguistic patterns of two or more texts to determine whether they were written by the same author. The analysis is traditionally performed by experts who consider linguistic features such as spelling mistakes, grammatical inconsistencies, and stylistics. Machine learning algorithms can be trained to accomplish the same task, but have traditionally relied on so-called stylometric features. The disadvantage of such features is that their reliability is greatly diminished for short and topically varied social media texts. In this interdisciplinary work, we propose a substantial extension of a recently published hierarchical Siamese neural network approach, with which it is feasible to learn neural features and to visualize the decision-making process. For this purpose, a new large-scale corpus of…
Taxonomy
Methods: Siamese Network
