Were You Helpful -- Predicting Helpful Votes from Amazon Reviews

Emin Kirimlioglu; Harrison Kung; Dominic Orlando

arXiv:2412.02884·cs.NE·December 5, 2024

Open Access

TL;DR

This study develops a machine learning model that predicts the helpfulness of Amazon reviews primarily from metadata features such as the number of images, the reviewer's historical helpful votes, and review timing. These features correlated more strongly with helpfulness than features derived from the review text.

Contribution

The paper introduces a predictive model built on metadata features and reports that these features correlate more strongly with helpfulness than the sentiment features obtained through natural language processing.

Findings

01 Metadata features significantly predict helpfulness

02 Images and reviewer history are strong indicators

03 Content analysis was less effective than metadata-based models

Abstract

This project investigates factors that influence the perceived helpfulness of Amazon product reviews through machine learning techniques. After extensive feature analysis and correlation testing, we identified key metadata characteristics that serve as strong predictors of review helpfulness. While we initially explored natural language processing approaches using TextBlob for sentiment analysis, our final model focuses on metadata features that demonstrated more significant correlations, including the number of images per review, the reviewer's historical helpful votes, and temporal aspects of the review. The data pipeline encompasses careful preprocessing and feature standardization steps to prepare the input for model training. Through systematic evaluation of different feature combinations, we discovered that the metadata elements we selected using a threshold provide reliable signals when…
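
The selection and standardization steps described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not taken from the paper: the feature names, the toy values, the use of Pearson correlation, and the cutoff of 0.5 are all hypothetical, since the abstract does not state which correlation measure or threshold the authors used.

```python
import math
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (sx * sy)


def select_features(columns, target, threshold):
    """Keep features whose absolute correlation with the target meets the threshold."""
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, target)) >= threshold
    ]


def standardize(values):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Toy data, invented for illustration only (not from the paper's dataset).
columns = {
    "image_count": [0, 1, 0, 3, 2, 0],
    "reviewer_past_helpful_votes": [2, 10, 1, 40, 25, 3],
    "review_hour": [9, 9, 14, 14, 20, 20],
}
helpful_votes = [0, 3, 0, 12, 7, 1]

THRESHOLD = 0.5  # hypothetical cutoff; the abstract does not give the value

selected = select_features(columns, helpful_votes, THRESHOLD)
standardized = {name: standardize(columns[name]) for name in selected}

print("selected features:", selected)
```

On this toy data, the two features that track the vote counts closely pass the cutoff and the weakly correlated one is dropped. A real pipeline would likely compute the standardization statistics on the training split only and reuse them for held-out data.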

Taxonomy

Topics: Internet Traffic Analysis and Secure E-voting