Identifying leading indicators of product recalls from online reviews using positive unlabeled learning and domain adaptation
Shreesh Kumara Bhat, Aron Culotta

TL;DR
This paper presents a method that combines positive unlabeled learning and domain adaptation to identify potentially hazardous products from online reviews, providing early warnings before official recalls.
Contribution
It introduces an approach for mining online reviews to detect safety hazards, addressing the scarcity of labeled data by combining positive unlabeled learning with domain adaptation.
Findings
Achieved an absolute F1 score improvement of 8% over baselines on a validation set of manually annotated Amazon product reviews.
Identified safety hazard reviews prior to recall for 45% of known recalled products.
Demonstrated potential for early warning of hazardous products.
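To make "absolute" concrete: an absolute F1 improvement is a difference in F1 points, not a ratio. The short sketch below illustrates the arithmetic. The confusion counts and the resulting baseline F1 of 0.60 are hypothetical values chosen only so the gap comes out to 8 points; they are not figures from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical confusion counts, for illustration only (not from the paper).
baseline_f1 = f1_score(tp=60, fp=40, fn=40)  # 120 / 200
proposed_f1 = f1_score(tp=68, fp=32, fn=32)  # 136 / 200

# Absolute gain: difference in F1 points.
absolute_gain = proposed_f1 - baseline_f1
# Relative gain: the same difference expressed as a fraction of the baseline.
relative_gain = absolute_gain / baseline_f1

print(f"absolute gain: {absolute_gain:.2f}")
print(f"relative gain: {relative_gain:.1%}")
```

With these made-up counts the absolute gain is 8 points while the relative gain is larger, which is why the distinction matters when comparing reported results.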
Abstract
Consumer protection agencies are charged with safeguarding the public from hazardous products, but the thousands of products under their jurisdiction make it challenging to identify and respond to consumer complaints quickly. From the consumer's perspective, online reviews can provide evidence of product defects, but manually sifting through hundreds of reviews is not always feasible. In this paper, we propose a system to mine Amazon.com reviews to identify products that may pose safety or health hazards. Since labeled data for this task are scarce, our approach combines positive unlabeled learning with domain adaptation to train a classifier from consumer complaints submitted to the U.S. Consumer Product Safety Commission. On a validation set of manually annotated Amazon product reviews, we find that our approach results in an absolute F1 score improvement of 8% over the best competing…
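As a rough illustration of what positive unlabeled (PU) learning involves, the sketch below shows one common PU technique, the score calibration of Elkan and Noto (2008). The abstract does not say which PU method the authors use, so this should not be read as their implementation, and the domain adaptation step is not shown. It assumes a classifier has already been trained to separate labeled positives (for example, CPSC complaints) from unlabeled text (for example, Amazon reviews); all scores and function names here are hypothetical.

```python
from statistics import mean


def estimate_label_frequency(scores_on_known_positives: list[float]) -> float:
    """Estimate c = P(labeled | positive) as the mean classifier score
    on held-out examples that are known to be positive."""
    if not scores_on_known_positives:
        raise ValueError("need at least one held-out positive example")
    return mean(scores_on_known_positives)


def adjust_scores(scores: list[float], c: float) -> list[float]:
    """Convert P(labeled | x) into an estimate of P(positive | x) by
    dividing by c, clipping at 1.0 so results stay valid probabilities."""
    return [min(1.0, s / c) for s in scores]


# Hypothetical scores from a labeled-vs-unlabeled classifier.
held_out_positive_scores = [0.5, 0.3, 0.4]
unlabeled_review_scores = [0.1, 0.2, 0.6]

c = estimate_label_frequency(held_out_positive_scores)
hazard_probabilities = adjust_scores(unlabeled_review_scores, c)
print(c, hazard_probabilities)
```

The design point is that a classifier trained on positive versus unlabeled data systematically underestimates the probability of the positive class, because some unlabeled examples are in fact positive; dividing by the estimated label frequency corrects for that under the assumption that labeled positives are a random sample of all positives.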
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Spam and Phishing Detection · Text and Document Classification Technologies
