Detecting Sockpuppets in Deceptive Opinion Spam
Marjan Hosseinia, Arjun Mukherjee

TL;DR
This paper presents novel methods for detecting sockpuppets in deceptive opinion spam by leveraging stylistic features and unlabeled data, improving accuracy in authorship verification tasks.
Contribution
It introduces two techniques for sockpuppet detection: a feature subsampling scheme based on KL-Divergence, and spy induction, a transduction method that exploits the unlabeled test set. Both extend existing authorship attribution techniques.
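As a rough illustration of KL-Divergence-based feature selection, the sketch below scores each token by its per-term contribution to KL(P || Q), where P is one author's smoothed unigram model and Q is a background model. The function names, the unigram representation, the add-alpha smoothing, and the top-k cutoff are all assumptions made for this example; the paper's actual subsampling scheme may differ.

```python
import math
from collections import Counter


def smoothed_dist(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution over a shared vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_feature_scores(author_tokens, background_tokens):
    """Per-feature contribution p(w) * log(p(w) / q(w)) to KL(P || Q).

    P is the author's model, Q the background model (hypothetical setup).
    The contributions sum to the full KL-Divergence.
    """
    vocab = set(author_tokens) | set(background_tokens)
    p = smoothed_dist(author_tokens, vocab)
    q = smoothed_dist(background_tokens, vocab)
    return {w: p[w] * math.log(p[w] / q[w]) for w in vocab}


def top_k_features(author_tokens, background_tokens, k):
    """Keep the k features that contribute most to the divergence."""
    scores = kl_feature_scores(author_tokens, background_tokens)
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


author = ["honestly"] * 5 + ["the"] * 5
background = ["the"] * 10 + ["room"] * 5
selected = top_k_features(author, background, k=1)
```

Here the token the author overuses relative to the background receives the largest score, which is the intuition behind using divergence to pick discriminative stylistic features.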
Findings
Feature subsampling improves discriminative power.
Spy induction effectively retrieves hidden sockpuppet samples.
Methods outperform baseline approaches.
Abstract
This paper explores the problem of sockpuppet detection in deceptive opinion spam using authorship attribution and verification approaches. Two methods are explored. The first is a feature subsampling scheme that uses the KL-Divergence on stylistic language models of an author to find discriminative features. The second is a transduction scheme, spy induction, that leverages the diversity of authors in the unlabeled test set by sending a set of spies (positive samples) from the training set to retrieve hidden samples in the unlabeled test set using nearest and farthest neighbors. Experiments using ground truth sockpuppet data show the effectiveness of the proposed schemes.
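The nearest/farthest-neighbor idea behind spy induction can be sketched as follows. This is a simplified single step under stated assumptions, not the paper's procedure: samples are plain numeric feature vectors, distance is Euclidean, each unlabeled sample is ranked by its mean distance to the spies, and the `k` closest and `k` farthest samples are returned as candidate positives and negatives. The name `spy_induction_step` and the parameter `k` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def spy_induction_step(spies, unlabeled, k):
    """Rank unlabeled samples by mean distance to the spies.

    Returns (nearest, farthest): indices of the k unlabeled samples closest
    to the spies (candidate hidden positives) and the k farthest from them
    (candidate negatives). Assumes 2 * k <= len(unlabeled) so the two
    groups do not overlap.
    """
    def mean_dist(i):
        return sum(euclidean(unlabeled[i], s) for s in spies) / len(spies)

    # Ascending by mean distance; ties broken by index for determinism.
    ranked = sorted(range(len(unlabeled)), key=lambda i: (mean_dist(i), i))
    return ranked[:k], ranked[-k:]


spies = [(0.0, 0.0), (0.0, 1.0)]            # known positive samples
unlabeled = [(0.1, 0.5), (5.0, 5.0), (0.0, 0.4), (9.0, 9.0)]
nearest, farthest = spy_induction_step(spies, unlabeled, k=1)
```

In a fuller pipeline, the retrieved candidates would typically be added to the training set with their induced labels before a classifier is retrained; that loop is omitted here.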
Taxonomy
Topics: Spam and Phishing Detection · Authorship Attribution and Profiling · Hate Speech and Cyberbullying Detection
