# Ranking Online Consumer Reviews

**Authors:** Sunil Saumya, Jyoti Prakash Singh, Abdullah Mohammed Baabdullah, Nripendra P. Rana, Yogesh K. Dwivedi

arXiv: 1901.06274 · 2019-01-21

## TL;DR

This paper proposes a system to rank online consumer reviews by predicting their helpfulness scores using machine learning, improving review visibility and aiding consumers in decision-making.

## Contribution

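The study introduces a review ranking system that combines features from review text, product descriptions, and customer Q&A data. A random-forest classifier first separates low-quality from high-quality reviews, and a gradient boosting regressor then predicts helpfulness scores for the high-quality ones only.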

## Key findings

- High-quality reviews are effectively ranked in top positions.
- Inclusion of product description and Q&A features improves prediction accuracy.
- On data from two Indian e-commerce websites, 3-4 new high-quality reviews appear in the top ten after ranking, alongside 5-6 old reviews.

## Abstract

Product reviews are posted online in the hundreds, and even in the thousands for some popular products. Handling such a large volume of continuously generated online content is a challenging task for buyers, sellers, and even researchers. The purpose of this study is to rank the overwhelming number of reviews using their predicted helpfulness score. The helpfulness score is predicted from features extracted from a product's review text, product description, and customer question-answer data, using a random-forest classifier and a gradient boosting regressor. The random-forest classifier first classifies reviews as low or high quality. The gradient boosting regressor then predicts a helpfulness score only for the high-quality reviews. The helpfulness score of the low-quality reviews is not calculated because they will never be among the top k reviews; they are simply appended to the end of the review list on the review-listing website. The proposed system provides fair review placement on review-listing pages and makes all high-quality reviews visible to customers at the top. Experimental results on data from two popular Indian e-commerce websites validate our claim: 3-4 new high-quality reviews are placed in the top ten, along with 5-6 old reviews, based on review helpfulness. Our findings indicate that including features from product description data and customer question-answer data improves the prediction accuracy of the helpfulness score.
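
The two-stage ranking described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `rank_reviews`, `is_high_quality`, and `predict_helpfulness` are hypothetical names, and the two callables stand in for a trained random-forest classifier and a trained gradient boosting regressor. The toy word-count rules in the usage example are placeholders for those models, not features from the paper.

```python
from typing import Any, Callable, Dict, List, Tuple

Review = Dict[str, Any]


def rank_reviews(
    reviews: List[Review],
    is_high_quality: Callable[[Review], bool],       # stand-in for the classifier
    predict_helpfulness: Callable[[Review], float],  # stand-in for the regressor
) -> List[Review]:
    """Rank high-quality reviews by predicted helpfulness, then append the rest."""
    scored: List[Tuple[float, Review]] = []
    low_quality: List[Review] = []

    for review in reviews:
        if is_high_quality(review):
            # Only high-quality reviews get a helpfulness score.
            scored.append((predict_helpfulness(review), review))
        else:
            # Low-quality reviews are never scored; they go to the end as-is.
            low_quality.append(review)

    # Highest predicted helpfulness first; the sort is stable, so ties keep
    # their original relative order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [review for _, review in scored] + low_quality


if __name__ == "__main__":
    reviews = [
        {"id": "r1", "text": "ok"},
        {"id": "r2", "text": "Battery lasts two days; camera struggles in low light."},
        {"id": "r3", "text": "Fits the description, and the seller answered my sizing question."},
        {"id": "r4", "text": "bad"},
    ]

    # Toy placeholders (assumptions): a real system would call trained models here.
    def toy_classifier(review: Review) -> bool:
        return len(review["text"].split()) >= 5

    def toy_regressor(review: Review) -> float:
        return len(review["text"].split()) / 20.0

    ranked = rank_reviews(reviews, toy_classifier, toy_regressor)
    print([r["id"] for r in ranked])  # prints ['r3', 'r2', 'r1', 'r4']
```

Skipping the regressor for low-quality reviews mirrors the abstract's reasoning: those reviews would not reach the top k anyway, so scoring them adds cost without changing what customers see first.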

---
Source: https://tomesphere.com/paper/1901.06274