Were You Helpful -- Predicting Helpful Votes from Amazon Reviews
Emin Kirimlioglu, Harrison Kung, Dominic Orlando

TL;DR
This study develops a machine learning model that predicts Amazon review helpfulness primarily based on metadata features like images, reviewer history, and timing, highlighting the importance of user behavior over text content.
Contribution
The paper introduces a predictive model built on metadata features and shows that these features correlate more strongly with helpfulness than the features produced by natural language processing approaches.
Findings
Metadata features significantly predict helpfulness
Images and reviewer history are strong indicators
Content analysis was less effective than metadata-based models
Abstract
This project investigates factors that influence the perceived helpfulness of Amazon product reviews through machine learning techniques. After extensive feature analysis and correlation testing, we identified key metadata characteristics that serve as strong predictors of review helpfulness. While we initially explored natural language processing approaches using TextBlob for sentiment analysis, our final model focuses on metadata features that demonstrated more significant correlations, including the number of images per review, the reviewer's historical helpful votes, and temporal aspects of the review. The data pipeline encompasses careful preprocessing and feature standardization steps to prepare the input for model training. Through systematic evaluation of different feature combinations, we discovered that metadata elements selected using a threshold provide reliable signals when…
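The feature selection and standardization steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the use of Pearson correlation, the threshold value of 0.5, the feature names (`num_images`, `reviewer_past_helpful_votes`, `title_length`), and the toy data are all assumptions made for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

def select_features(columns, target, threshold):
    """Keep features whose absolute correlation with the target meets the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) >= threshold]

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

# Toy data, invented for illustration only (hypothetical feature names).
reviews = {
    "num_images": [0, 1, 3, 0, 2, 4],
    "reviewer_past_helpful_votes": [2, 10, 40, 1, 25, 60],
    "title_length": [5, 3, 4, 4, 3, 5],
}
helpful_votes = [0, 3, 12, 1, 7, 20]

# Assumed threshold; the abstract does not state the value used.
selected = select_features(reviews, helpful_votes, threshold=0.5)
scaled = {name: standardize(reviews[name]) for name in selected}
print(selected)
```

In this toy data the two metadata columns that track the vote counts pass the threshold, while the weakly correlated `title_length` column is dropped. A real pipeline would more likely rely on a library such as pandas or scikit-learn for these steps.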
Taxonomy
Topics: Internet Traffic Analysis and Secure E-voting
