The multi-modal universe of fast-fashion: the Visuelle 2.0 benchmark
Geri Skenderi, Christian Joppi, Matteo Denitto, Berniero Scarpa, Marco Cristani

TL;DR
Visuelle 2.0 is a new multi-modal dataset designed for fast-fashion sales prediction, demonstrating that image data combined with deep learning improves short-term sales forecasting accuracy.
Contribution
The paper introduces Visuelle 2.0, a comprehensive dataset with multi-modal data for fast-fashion, and shows how computer vision enhances short-term sales forecasting.
Findings
Image data improves forecasting accuracy by up to 7%.
Deep networks leveraging images outperform baseline methods.
Multi-modal data is crucial for short-term sales prediction.
Abstract
We present Visuelle 2.0, the first dataset useful for facing diverse prediction problems that a fast-fashion company has to manage routinely. Furthermore, we demonstrate how the use of computer vision is substantial in this scenario. Visuelle 2.0 contains data for 6 seasons / 5355 clothing products of Nuna Lie, a famous Italian company with hundreds of shops located in different areas within the country. In particular, we focus on a specific prediction problem, namely short-observation new product sale forecasting (SO-fore). SO-fore assumes that the season has started and a set of new products is on the shelves of the different stores. The goal is to forecast the sales for a particular horizon, given a short, available past (few weeks), since no earlier statistics are available. To be successful, SO-fore approaches should capture this short past and exploit other modalities or exogenous…
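The SO-fore setup described in the abstract can be sketched as follows: a weekly sales series is split into a short observed window and the horizon to forecast. This is a minimal illustration under stated assumptions, not the authors' method. The function names, the naive mean baseline, the MAE metric, and the sales numbers are all hypothetical and do not come from Visuelle 2.0.

```python
import numpy as np


def split_short_observation(sales, observed_weeks):
    """Split a weekly sales series into the observed past and the horizon to forecast."""
    sales = np.asarray(sales, dtype=float)
    if not 0 < observed_weeks < len(sales):
        raise ValueError("observed_weeks must leave at least one week to forecast")
    return sales[:observed_weeks], sales[observed_weeks:]


def naive_forecast(observed, horizon):
    """Hypothetical baseline: repeat the mean of the observed weeks over the horizon."""
    return np.full(horizon, observed.mean())


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted weekly sales."""
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(predicted))))


# Made-up weekly sales for one new product (illustrative only).
weekly_sales = [10, 14, 12, 9, 7, 5]

# Only the first two weeks are observed; the remaining weeks are the target.
observed, target = split_short_observation(weekly_sales, observed_weeks=2)
forecast = naive_forecast(observed, horizon=len(target))
error = mean_absolute_error(target, forecast)
```

A baseline like this uses only the short sales history. The abstract's point is that SO-fore approaches should also exploit other modalities, such as product images, because so little history is available.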
Taxonomy
Topics: Aesthetic Perception and Analysis · Visual Attention and Saliency Detection · Color Science and Applications
Methods: Masked autoencoder
