Hierarchical classification of e-commerce related social media

Matthew Long; Aditya Jami; Ashutosh Saxena

arXiv:1511.08299·cs.SI·November 30, 2015

TL;DR

This paper classifies short, noisy tweets into the root categories of Amazon's browse node hierarchy using labeled tweets, a much larger set of unlabeled tweets, and Amazon reviews, and applies query and document expansion techniques that yield modest gains.

Contribution

It introduces a hierarchical classification approach for e-commerce related social media, addressing challenges of short text and noisy data with expansion techniques.

Findings

1. Expansion techniques yielded modest improvements

2. Hierarchical classification effectively organizes social media data

3. Challenges remain in handling short, misspelled tweets

Abstract

In this paper, we attempt to classify tweets into root categories of the Amazon browse node hierarchy using a set of tweets with browse node ID labels, a much larger set of tweets without labels, and a set of Amazon reviews. Examining Twitter data presents unique challenges in that the samples are short (under 140 characters) and often contain misspellings or abbreviations that are trivial for a human to decipher but difficult for a computer to parse. A variety of query and document expansion techniques are implemented in an effort to improve information retrieval, with modest success.
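
The abstract does not say which expansion techniques were used, so the following is only a minimal sketch of one common form of query expansion (pseudo-relevance feedback), not the authors' method. It treats a tweet as the query, ranks a small review corpus by term overlap, and appends the most frequent new terms from the best-matching reviews. The function names, the overlap scoring, and the toy tweet and reviews are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap(query_tokens, doc_tokens):
    """Count the distinct query terms that also occur in the document."""
    return len(set(query_tokens) & set(doc_tokens))


def expand_query(tweet, reviews, top_k=2, n_terms=1):
    """Append the n_terms most frequent new terms found in the
    top_k reviews that share the most terms with the tweet."""
    query = tokenize(tweet)
    # Rank reviews by how many tweet terms they contain (hypothetical scoring).
    ranked = sorted(reviews, key=lambda r: overlap(query, tokenize(r)), reverse=True)
    counts = Counter()
    for review in ranked[:top_k]:
        # Only count terms the tweet does not already contain.
        counts.update(t for t in tokenize(review) if t not in query)
    extra = [term for term, _ in counts.most_common(n_terms)]
    return query + extra


# Toy data, invented for this sketch.
reviews = [
    "great lens sharp lens camera",
    "comfortable running shoes",
    "camera lens battery",
]
tweet = "new camera 2day!!"
expanded = expand_query(tweet, reviews)
print(expanded)
```

The expanded token list can then be passed to whatever classifier is in use in place of the raw tweet; the intent is that a short tweet picks up vocabulary from longer, related documents before it is matched against a category.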

Taxonomy

Topics: Text and Document Classification Technologies · Natural Language Processing Techniques · Topic Modeling