Opinion Spam Detection: A New Approach Using Machine Learning and Network-Based Algorithms

Kiril Danilchenko; Michael Segal; Dan Vilenchik

arXiv:2205.13422·cs.LG·May 27, 2022

TL;DR

This paper combines machine learning with network-based algorithms to detect opinion spam in online reviews when labeled data is scarce.

Contribution

The paper proposes a new method for classifying reviewers as spammers or benign that uses the users' graph structure and active learning to improve spam detection accuracy when labeled data is scarce.

Findings

01 Outperforms existing active learning methods

02 Requires fewer labeled samples for effective detection

03 Achieves higher accuracy on real-world datasets

Abstract

E-commerce is the fastest-growing segment of the economy. Online reviews play a crucial role in helping consumers evaluate and compare products and services. As a result, fake reviews (opinion spam) are becoming more prevalent and negatively impacting customers and service providers. There are many reasons why it is hard to identify opinion spammers automatically, including the absence of reliable labeled data. This limitation precludes an off-the-shelf application of a machine learning pipeline. We propose a new method for classifying reviewers as spammers or benign, combining machine learning with a message-passing algorithm that capitalizes on the users' graph structure to compensate for the possible scarcity of labeled data. We devise a new way of sampling the labels for the training step (active learning), replacing the typical uniform sampling. Experiments on three large…
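
The abstract names two ingredients: a message-passing step over the users' graph and a non-uniform way of choosing which labels to request. The sketch below is only an illustration of that general idea, not the authors' algorithm, since the abstract does not give their update rule or sampling criterion. Everything here is an assumption made for the example: the names `propagate` and `pick_query`, the `damping` parameter, the neighbor-averaging update, and the "closest to 0.5, then highest degree" query rule. The `prior` scores stand in for the output of any classifier.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # user id -> set of neighboring user ids


def propagate(graph: Graph, prior: Dict[str, float], labels: Dict[str, float],
              damping: float = 0.5, iters: int = 50) -> Dict[str, float]:
    """Hypothetical message-passing step (an assumption, not the paper's rule).

    Each unlabeled user's spam score is repeatedly set to a blend of its
    classifier prior and the mean score of its neighbors. Labeled users
    stay clamped to their known label (1.0 = spammer, 0.0 = benign).
    """
    scores = {u: labels.get(u, prior[u]) for u in graph}
    for _ in range(iters):
        updated = {}
        for u, nbrs in graph.items():
            if u in labels:
                updated[u] = labels[u]  # known labels never change
            elif nbrs:
                mean_nbr = sum(scores[v] for v in nbrs) / len(nbrs)
                updated[u] = damping * prior[u] + (1 - damping) * mean_nbr
            else:
                updated[u] = prior[u]  # isolated user: nothing to receive
        scores = updated
    return scores


def pick_query(graph: Graph, scores: Dict[str, float],
               labels: Dict[str, float]) -> str:
    """Hypothetical non-uniform sampling rule for active learning.

    Chooses the unlabeled user whose score is closest to 0.5 (most
    uncertain), breaking ties in favor of higher degree, then by id.
    """
    candidates = [u for u in graph if u not in labels]
    return min(candidates,
               key=lambda u: (abs(scores[u] - 0.5), -len(graph[u]), u))


if __name__ == "__main__":
    # Toy path graph a - b - c - d with one known spammer and one known benign user.
    graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
    prior = {"a": 0.5, "b": 0.5, "c": 0.5, "d": 0.5}
    labels = {"a": 1.0, "d": 0.0}
    scores = propagate(graph, prior, labels)
    print(scores)
    print("next user to label:", pick_query(graph, scores, labels))
```

In this toy setup the user next to the known spammer ends up with a higher score than the user next to the known benign account, which is the intuition behind letting the graph compensate for missing labels. A real pipeline would retrain the classifier after each queried label and rerun the propagation.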

Taxonomy

Topics: Spam and Phishing Detection · Network Security and Intrusion Detection · Sentiment Analysis and Opinion Mining
