Attribute Extraction from Product Titles in eCommerce

Ajinkya More

arXiv:1608.04670 · cs.CL · August 17, 2016 · 20 cites

TL;DR

This paper introduces a system for extracting product attributes from short eCommerce titles, combining sequence labeling algorithms with normalization to improve accuracy in a challenging, syntactically sparse context.

Contribution

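The paper presents an attribute extraction system for eCommerce product titles that pairs sequence labeling algorithms (Conditional Random Fields and Structured Perceptron) with a curated normalization scheme, using the brand attribute as the running example.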

Findings

01. Effective extraction of product attributes demonstrated

02. Comparison shows improved performance over baseline methods

03. System performs well on short, unstructured product titles

Abstract

This paper presents a named entity extraction system for detecting attributes in product titles of eCommerce retailers like Walmart. The absence of syntactic structure in such short pieces of text makes extracting attribute values a challenging problem. We find that combining sequence labeling algorithms such as Conditional Random Fields and Structured Perceptron with a curated normalization scheme produces an effective system for the task of extracting product attribute values from titles. To keep the discussion concrete, we will illustrate the mechanics of the system from the point of view of a particular attribute: brand. We also discuss the importance of an attribute extraction system in the context of retail websites with large product catalogs, compare our approach to other potential approaches to this problem, and end the paper with a discussion of the performance of our system…
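To make the "sequence labeling plus normalization" idea concrete, here is a minimal sketch of the second half of such a pipeline. It is not the paper's implementation: it assumes a sequence labeler (for example a CRF or structured perceptron) has already tagged each title token with BIO-style labels, and the tag names (`B-BRAND`, `I-BRAND`, `O`), the example title, the helper functions, and the contents of the normalization table are all hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical normalization table (an assumption, not from the paper):
# maps lowercased surface variants to one canonical brand name.
BRAND_NORMALIZATION: Dict[str, str] = {
    "hp": "HP",
    "hewlett packard": "HP",
    "hewlett-packard": "HP",
    "samsung": "Samsung",
}


def extract_brand_span(tokens: List[str], tags: List[str]) -> Optional[str]:
    """Return the first contiguous B-BRAND/I-BRAND span as text, or None."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    span: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B-BRAND":
            if span:  # a second span begins; keep only the first one
                break
            span = [token]
        elif tag == "I-BRAND" and span:
            span.append(token)
        elif span:  # the first span has ended
            break
    return " ".join(span) if span else None


def normalize_brand(raw: Optional[str]) -> Optional[str]:
    """Map an extracted brand string to its canonical form if one is known."""
    if raw is None:
        return None
    key = " ".join(raw.lower().split())  # lowercase and collapse whitespace
    return BRAND_NORMALIZATION.get(key, raw)


# Hypothetical title; the tags stand in for a sequence labeler's output.
title = "Hewlett Packard 15.6 inch Laptop 8GB RAM"
tokens = title.split()
tags = ["B-BRAND", "I-BRAND", "O", "O", "O", "O", "O"]

brand = normalize_brand(extract_brand_span(tokens, tags))
print(brand)  # HP
```

Keeping normalization as a separate lookup step means the labeler only has to find where the brand appears in the title, while spelling variants are reconciled afterwards against a curated list.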

Taxonomy

Topics: Natural Language Processing Techniques · Advanced Text Analysis Techniques · Topic Modeling