Large e-retailer image dataset for visual search and product classification
Arnaud Bellétoile (for the Cdiscount datascience team)

TL;DR
This paper introduces a large, diverse dataset of over 12 million images of 7 million products for visual search and classification, aiming to enhance deep learning applications in e-commerce.
Contribution
The paper provides a new extensive dataset for visual recognition in e-commerce and shares insights from a Kaggle classification challenge using this dataset.
Findings
The dataset contains over 12 million images across 5,000 categories.
The paper presents aspects of the winning solutions from the Kaggle image classification challenge organized around this dataset.
The dataset is intended to support research in visual search and product recommendation systems.
Abstract
Recent results of deep convolutional networks in visual recognition challenges open the path to a whole new set of disruptive user experiences such as visual search or recommendation. The list of companies offering this type of service is growing every day, but the adoption rate and the relevancy of results may vary a lot. We believe that the availability of large and diverse datasets is a necessary condition to improve the relevancy of such recommendation systems and facilitate their adoption. For that purpose, we wish to share with the community this dataset of more than 12M images of the 7M products of our online store classified into 5K categories. This original dataset is introduced in this article and several features are described. We also present some aspects of the winning solutions of our image classification challenge that was organized on the Kaggle platform around this set of…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Advanced Neural Network Applications
