Hierarchical classification of e-commerce related social media
Matthew Long, Aditya Jami, Ashutosh Saxena

TL;DR
This paper explores classifying short, noisy tweets into root categories of Amazon's browse node hierarchy using labeled tweets, unlabeled tweets, and Amazon reviews, applying query and document expansion techniques to improve retrieval.
Contribution
It introduces a hierarchical classification approach for e-commerce related social media, addressing challenges of short text and noisy data with expansion techniques.
Findings
Query and document expansion techniques yielded only modest improvements
Hierarchical classification effectively organizes social media data
Challenges remain in handling short, misspelled tweets
Abstract
In this paper, we attempt to classify tweets into root categories of the Amazon browse node hierarchy using a set of tweets with browse node ID labels, a much larger set of tweets without labels, and a set of Amazon reviews. Examining Twitter data presents unique challenges in that the samples are short (under 140 characters) and often contain misspellings or abbreviations that are trivial for a human to decipher but difficult for a computer to parse. A variety of query and document expansion techniques are implemented in an effort to improve information retrieval, with modest success.
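The abstract does not say which expansion techniques were used, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a tiny hypothetical abbreviation table (`ABBREVIATIONS`), builds one term-frequency profile per root category from review text, expands a tweet with terms that co-occur with it in unlabeled tweets, and picks the category whose profile is most similar by cosine similarity. All function names and sample data are made up for illustration.

```python
import math
from collections import Counter

# Hypothetical abbreviation table; the paper does not list one.
ABBREVIATIONS = {"hdphones": "headphones", "bk": "book", "gr8": "great"}


def tokenize(text):
    """Lowercase, split on whitespace, and normalize known abbreviations."""
    return [ABBREVIATIONS.get(tok, tok) for tok in text.lower().split()]


def build_category_profiles(reviews_by_category):
    """Build one term-frequency profile per root category from review text."""
    return {
        category: Counter(tok for review in reviews for tok in tokenize(review))
        for category, reviews in reviews_by_category.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def expand_query(tokens, unlabeled_tweets, top_k=3):
    """Append the terms that most often co-occur with the tweet's tokens
    in the unlabeled tweets (an assumed, simplified expansion step)."""
    query = set(tokens)
    cooccurring = Counter()
    for tweet in unlabeled_tweets:
        tweet_tokens = set(tokenize(tweet))
        if tweet_tokens & query:
            cooccurring.update(tweet_tokens - query)
    return tokens + [term for term, _ in cooccurring.most_common(top_k)]


def classify(tweet, profiles, unlabeled_tweets):
    """Return the category whose review profile best matches the expanded tweet."""
    vector = Counter(expand_query(tokenize(tweet), unlabeled_tweets))
    return max(profiles, key=lambda category: cosine(vector, profiles[category]))


# Toy data, invented for this example only.
reviews = {
    "Electronics": ["these headphones have great bass and battery"],
    "Books": ["a great novel with a gripping plot and author"],
}
unlabeled = ["new hdphones battery lasts long", "reading a novel tonight"]
profiles = build_category_profiles(reviews)
predicted = classify("luv my new hdphones", profiles, unlabeled)
```

In this toy setup the expansion step pulls in terms such as "battery" from a related unlabeled tweet, which gives the short tweet more overlap with the review vocabulary. A real system would likely need stop-word handling, term weighting such as TF-IDF, and a far larger normalization resource.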
Taxonomy
Topics: Text and Document Classification Technologies · Natural Language Processing Techniques · Topic Modeling
