Determining Individual Origin Similarity (DInOS): Binary Classification of Authors Using Stylometric Features

A. Kingsland; D. Fortin; E. Cary; S. Smith; K. Pazdernik; and R. Perko

arXiv:1912.03750 · cs.SI · December 10, 2019

TL;DR

This paper introduces DInOS, a stylometry-based binary classification method for determining author similarity. It reports an F1 score above 0.96 on a curated author classification task and is intended to aid the detection of disinformation campaigns on social media.

Contribution

The study adapts stylometric features for author similarity detection and demonstrates their high performance across machine learning and deep learning models.

Findings

01 Achieved >0.96 F1 score in author classification

02 Effective use of stylometric features for author similarity

03 Potential to improve disinformation campaign detection

Abstract

Author similarity detection is an integral first step in detecting state-led disinformation campaigns in an automated fashion. Current detection techniques require an analyst or subject matter expert to hand-curate accounts. Stylometric features have a rich history in identifying the authorship of unknown documents, but little exploration has been done on comparing authors to one another. We have adapted a select handful of stylometric features for use in author similarity metrics, and show that they achieve an F1 score above 0.96 on a curated author classification task, across both traditional machine learning and deep learning models. These features should contribute to the expanding field of author similarity research and expedite the process of detecting and mitigating large-scale social media disinformation campaigns.
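
To make the idea of comparing two authors by style concrete, here is a minimal Python sketch. It is not the authors' implementation. The abstract does not list the features DInOS uses, so the four features below (mean word length, type-token ratio, punctuation rate, mean sentence length) are assumptions chosen for illustration, as are the function names `stylometric_features`, `pair_features`, and `f1_score`. The pair vector of absolute feature differences is one plausible input to a binary "same author or not" classifier, and the F1 function shows how the reported metric is computed.

```python
import re
import string


def stylometric_features(text: str) -> list[float]:
    """Return a small, illustrative stylometric vector for one text.

    The feature choice is an assumption, not the feature set from the paper.
    """
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)   # guard against division by zero
    n_chars = max(len(text), 1)
    return [
        sum(len(w) for w in words) / n_words,                   # mean word length
        len(set(words)) / n_words,                              # type-token ratio
        sum(c in string.punctuation for c in text) / n_chars,   # punctuation rate
        n_words / max(len(sentences), 1),                       # mean sentence length (words)
    ]


def pair_features(text_a: str, text_b: str) -> list[float]:
    """Absolute per-feature difference between two texts (symmetric in its inputs)."""
    fa = stylometric_features(text_a)
    fb = stylometric_features(text_b)
    return [abs(x - y) for x, y in zip(fa, fb)]


def f1_score(y_true: list[int], y_pred: list[int]) -> float:
    """F1 for binary labels, where 1 means 'same author'."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In such a setup, each training example would be a `pair_features` vector labeled 1 if both texts come from the same author and 0 otherwise, and any binary classifier could be fit on those vectors.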

Taxonomy

Topics: Authorship Attribution and Profiling · Text Readability and Simplification · Topic Modeling