Social Media Writing Style Fingerprint

Himank Yadav; Juliang Li

arXiv:1712.04762 · cs.CL · December 27, 2017

TL;DR

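This paper presents a system for social media text authorship attribution that arranges word-level and character-level models like the hidden layers of a simple neural network and combines their outputs by an unweighted majority vote. Writing bias in social media posts was taken into account when collecting the training data, and the system reaches a precision of 0.82, a recall of 0.926, and an F-measure of 0.869.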

Contribution

It presents a layered system that combines word-level and character-level models, with the choice of models in each layer informed by validation performance, for social media authorship attribution.

Findings

01. Precision of 0.82 in authorship attribution

02. Recall of 0.926, the highest of the three reported scores

03. F-measure of 0.869, summarizing precision and recall in a single score
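The three scores can be checked against each other. The sketch below assumes the reported F-measure is the balanced F1 score, the harmonic mean of precision and recall; the source does not say which variant was used, and the function name `f_measure` is illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the paper's "F-measure" is F1; the source does not say so.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
score = f_measure(0.82, 0.926)
print(round(score, 3))
```

Under this assumption the result is about 0.870, which agrees with the reported 0.869 to within the rounding of the published precision and recall.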

Abstract

We present our approach for computer-aided social media text authorship attribution based on recent advances in short text authorship verification. We use various natural language techniques to create word-level and character-level models that act as hidden layers to simulate a simple neural network. The choice of word-level and character-level models in each layer was informed through validation performance. The output layer of our system uses an unweighted majority vote vector to arrive at a conclusion. We also considered writing bias in social media posts while collecting our training dataset to increase system robustness. Our system achieved a precision, recall, and F-measure of 0.82, 0.926 and 0.869 respectively.
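The voting step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Jaccard-overlap scorers, the threshold of 0.2, and all function names are hypothetical stand-ins for the paper's word-level and character-level models, which the abstract does not specify. Only the unweighted majority vote over per-model decisions comes from the source.

```python
def char_ngrams(text: str, n: int = 3) -> set:
    """Set of character n-grams in the lowercased text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def word_set(text: str) -> set:
    """Set of whitespace-separated lowercased tokens."""
    return set(text.lower().split())


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def model_votes(known: str, unknown: str, threshold: float = 0.2) -> list:
    """One boolean vote per (hypothetical) model: True means 'same author'."""
    scores = [
        jaccard(word_set(known), word_set(unknown)),              # word-level
        jaccard(char_ngrams(known, 3), char_ngrams(unknown, 3)),  # char 3-grams
        jaccard(char_ngrams(known, 4), char_ngrams(unknown, 4)),  # char 4-grams
    ]
    return [score >= threshold for score in scores]


def majority_vote(votes: list) -> bool:
    """Unweighted majority: True only if more than half of the votes are True."""
    return 2 * sum(votes) > len(votes)


votes = model_votes("omg cant wait for the weekend lol",
                    "lol cant wait 4 the game omg")
print(votes, majority_vote(votes))
```

Every model contributes exactly one vote regardless of how confident its score is, which is what makes the vote unweighted. With an odd number of models a tie cannot occur; with an even number, this sketch treats a tie as a negative decision.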

Taxonomy

Topics: Authorship Attribution and Profiling · Hate Speech and Cyberbullying Detection · Topic Modeling