Metadata Integration for Spam Reviews Detection on Vietnamese E-commerce Websites

Co Van Dinh; Son T. Luu

arXiv:2405.13292·cs.CL·August 2, 2024

TL;DR

This paper introduces the ViSpamReviews v2 dataset and a novel integration of textual and categorical metadata for improved spam review detection on Vietnamese e-commerce sites, achieving state-of-the-art results.

Contribution

It presents a new dataset and a combined deep learning approach that leverages metadata and text features for more accurate spam review classification.

Findings

01

Product category enhances DNN performance.

02

Text features improve detection accuracy.

03

Achieved state-of-the-art results with PhoBERT and SPhoBERT.

Abstract

The problem of detecting spam reviews (opinions) has received significant attention in recent years, especially with the rapid development of e-commerce. Spam reviews are often classified based on comment content, but in some cases the content alone is insufficient for models to accurately determine the review label. In this work, we introduce the ViSpamReviews v2 dataset, which includes metadata of reviews with the objective of integrating supplementary attributes for spam review classification. We propose a novel approach to simultaneously integrate both textual and categorical attributes into the classification model. In our experiments, the product category proved effective when combined with deep neural network (DNN) models, while text features performed well on both DNN models and the model achieved state-of-the-art performance in the problem of detecting spam reviews on Vietnamese e-commerce…
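
The idea of feeding a classifier both textual and categorical attributes can be illustrated with a minimal sketch. This is not the authors' architecture: the hashed bag-of-words encoder stands in for a learned text model such as PhoBERT, and the `CATEGORIES` list, the function names, and the vector size are hypothetical choices made only for illustration.

```python
import hashlib

# Hypothetical product categories; the real dataset defines its own set.
CATEGORIES = ["fashion", "electronics", "beauty"]


def text_features(text, dim=8):
    """Hashed bag-of-words vector; a stand-in for a learned text encoder."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0  # bucket each token by its hash
    return vec


def category_features(category):
    """One-hot vector over CATEGORIES; unknown categories map to all zeros."""
    return [1.0 if category == c else 0.0 for c in CATEGORIES]


def combine(text, category, dim=8):
    """Concatenate text and category features into one classifier input."""
    return text_features(text, dim) + category_features(category)


features = combine("giao hàng nhanh", "fashion")
```

The concatenated vector would then be passed to any downstream classifier. In a neural model, the one-hot category would typically be replaced by a learned embedding, but the principle of joining the two feature groups before classification is the same.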

Code & Models

Repositories

sonlam1102/vispamdetection (official)