Explainable Authorship Verification in Social Media via Attention-based Similarity Learning

Benedikt Boenninghoff; Steffen Hessler; Dorothea Kolossa; Robert M. Nickel

arXiv:1910.08144·cs.CL·November 21, 2019

TL;DR

This paper introduces an attention-based neural network approach for authorship verification in social media texts, outperforming traditional stylometric methods and providing interpretability through visualization of decision factors.

Contribution

The paper extends a hierarchical Siamese neural network so that it learns neural features and makes its decision process visualizable, with a design tailored to short, topically varied social media texts.
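The general idea of attention-based similarity learning can be illustrated with a minimal sketch. This is not the authors' architecture: the token vectors, the `query` vector, the Euclidean distance, and the `threshold` are all hypothetical stand-ins for components that a real model would learn during training. The sketch only shows how one shared encoder applied to both texts yields a distance for the verification decision, plus per-token attention weights that can be inspected afterwards.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, query):
    """Pool token vectors into one embedding.

    Each token is weighted by softmax(dot(token, query)); the weights are
    returned as well so they can be visualized per token.
    """
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def verify(doc_a, doc_b, query, threshold):
    """Siamese-style comparison: the SAME encoder (attention_pool with the
    same query) is applied to both documents, then the embeddings are compared.
    The threshold is an assumed, hand-picked value for illustration.
    """
    emb_a, weights_a = attention_pool(doc_a, query)
    emb_b, weights_b = attention_pool(doc_b, query)
    distance = euclidean_distance(emb_a, emb_b)
    return distance < threshold, distance, weights_a, weights_b

# Toy inputs: two "documents" of two 2-dimensional token vectors each.
doc_a = [[1.0, 0.0], [0.0, 1.0]]
doc_b = [[0.9, 0.1], [0.1, 0.9]]
query = [1.0, 0.0]

same_author, distance, weights_a, weights_b = verify(doc_a, doc_b, query, threshold=0.5)
```

In this sketch, `weights_a` and `weights_b` play the role of the attention weights that an explainable model exposes: tokens with larger weights contributed more to the pooled embedding and therefore to the distance.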

Findings

01. Siamese network outperforms stylometric approaches

02. The model effectively captures linguistic features

03. Attention weights align with linguistic categories

Abstract

Authorship verification is the task of analyzing the linguistic patterns of two or more texts to determine whether they were written by the same author or not. The analysis is traditionally performed by experts who consider linguistic features such as spelling mistakes, grammatical inconsistencies, and stylistics. Machine learning algorithms, on the other hand, can be trained to accomplish the same, but have traditionally relied on so-called stylometric features. The disadvantage of such features is that their reliability is greatly diminished for short and topically varied social media texts. In this interdisciplinary work, we propose a substantial extension of a recently published hierarchical Siamese neural network approach, with which it is feasible to learn neural features and to visualize the decision-making process. For this purpose, a new large-scale corpus of…

Taxonomy

Methods: Siamese Network