Interpretable Methods for Identifying Product Variants
Rebecca West, Khalifeh Al Jadda, Unaiza Ahsan, Huiming Qu, Xiquan Cui

TL;DR
This paper presents a novel, interpretable method combining constrained clustering and NLP techniques to accurately identify product variants across diverse e-commerce categories, enhancing product organization and customer experience.
Contribution
It introduces a new approach that integrates clustering and NLP for product variant identification, emphasizing interpretability and high accuracy across multiple categories.
Findings
Outperforms an existing baseline that uses a vanilla classification approach to identify product variants.
Achieves high accuracy across diverse product categories.
Provides an interpretable model accessible to business partners.
Abstract
For e-commerce companies with large product selections, the organization and grouping of products in meaningful ways is important for creating great customer shopping experiences and cultivating an authoritative brand image. One important way of grouping products is to identify a family of product variants, where the variants are mostly the same with slight and yet distinct differences (e.g. color or pack size). In this paper, we introduce a novel approach to identifying product variants. It combines both constrained clustering and tailored NLP techniques (e.g. extraction of product family name from unstructured product title and identification of products with similar model numbers) to achieve superior performance compared with an existing baseline using a vanilla classification approach. In addition, we design the algorithm to meet certain business criteria, including meeting high…
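To make the idea of combining title-based family-name extraction with a grouping constraint concrete, here is a minimal Python sketch. It is not the authors' algorithm: it swaps constrained clustering for plain exact-key grouping, and the variant word list, the pack-size pattern, the `brand` field, and the function names `family_name` and `group_variants` are all assumptions made for illustration. The brand check stands in for a cannot-link constraint, so products from different brands are never placed in the same family.

```python
import re
from collections import defaultdict

# Hypothetical variant attributes to strip from titles (illustrative only).
VARIANT_WORDS = {"red", "blue", "black", "white", "small", "medium", "large"}
# Hypothetical pack-size pattern, e.g. "2-pack" or "12 pack".
PACK_RE = re.compile(r"\b\d+\s*-?\s*pack\b")


def family_name(title):
    """Crude family-name extraction: lowercase, drop pack sizes and variant words."""
    text = PACK_RE.sub(" ", title.lower())
    tokens = [t for t in re.findall(r"[a-z0-9]+", text) if t not in VARIANT_WORDS]
    return " ".join(tokens)


def group_variants(products):
    """Group product ids that share a brand and an extracted family name.

    Keying on brand acts like a cannot-link constraint: products from
    different brands can never end up in the same group.
    """
    groups = defaultdict(list)
    for p in products:
        key = (p["brand"].lower(), family_name(p["title"]))
        groups[key].append(p["id"])
    # Only groups with at least two members count as variant families.
    return [ids for ids in groups.values() if len(ids) > 1]


products = [
    {"id": 1, "brand": "Acme", "title": "Cordless Drill Red"},
    {"id": 2, "brand": "Acme", "title": "Cordless Drill Blue"},
    {"id": 3, "brand": "Bolt", "title": "Cordless Drill 2-Pack"},
]
print(group_variants(products))  # [[1, 2]]
```

Products 1 and 2 differ only in color and share a brand, so they form one family. Product 3 reduces to the same family name but belongs to a different brand, so the constraint keeps it separate. Each grouping decision can be traced back to a readable key, which is one simple way to keep such a method interpretable.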
Taxonomy
Topics: Web Data Mining and Analysis · Text and Document Classification Technologies · Advanced Text Analysis Techniques
