Large Scale Long-tailed Product Recognition System at Alibaba
Xiangzeng Zhou, Pan Pan, Yun Zheng, Yinghui Xu, Rong Jin

TL;DR
This paper introduces SICoT, a novel large-scale product recognition system at Alibaba that leverages side information and co-training to address long-tailed data imbalance, significantly improving recognition accuracy and user engagement.
Contribution
The paper proposes a new side information based co-training system, SICoT, that effectively transfers knowledge from head to tail classes in large-scale, imbalanced datasets.
Findings
SICoT effectively alleviates the long-tail problem in large-scale datasets.
The system demonstrates scalability across datasets with up to one million classes.
Deployment on Alibaba's platform improves user conversion rates.
Abstract
A practical large-scale product recognition system suffers from long-tailed, imbalanced training data in the e-commerce setting at Alibaba. Besides product images, Alibaba has plenty of image-related side information (e.g., titles, tags) that reveals rich semantic information about the images. Prior work mainly addresses the long-tail problem from the visual perspective only and does not consider leveraging the side information. In this paper, we present a novel side-information-based large-scale visual recognition co-training (SICoT) system that deals with the long-tail problem by leveraging image-related side information. In the proposed co-training system, we first introduce a bilinear word attention module that constructs a semantic embedding over the noisy side information. A visual feature and semantic embedding co-training scheme is then designed…
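To make the idea of bilinear word attention concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the module's exact form, so this assumes each word in the side information is scored against the image's visual feature with a bilinear form e_i^T W v, the scores are normalized with a softmax, and the semantic embedding is the attention-weighted sum of the word vectors. The function name `bilinear_word_attention`, the matrix `W`, and the toy vectors are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bilinear_word_attention(word_vecs, visual_feat, W):
    """Hypothetical bilinear attention over side-information words.

    word_vecs:   list of n word vectors, each of length d_w
    visual_feat: visual feature of length d_v
    W:           d_w x d_v bilinear weight matrix (list of rows); assumed learned
    Returns (attention weights, semantic embedding of length d_w).
    """
    # Project the visual feature into word space: q = W v
    q = [sum(w * v for w, v in zip(row, visual_feat)) for row in W]
    # Bilinear score for each word: e_i^T W v = e_i . q
    scores = [sum(e_k * q_k for e_k, q_k in zip(e, q)) for e in word_vecs]
    weights = softmax(scores)
    # Semantic embedding: attention-weighted sum of the word vectors
    d_w = len(word_vecs[0])
    embedding = [sum(a * e[k] for a, e in zip(weights, word_vecs))
                 for k in range(d_w)]
    return weights, embedding


# Toy example: three "words"; the first aligns with the image, the last opposes it.
word_vecs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
visual_feat = [1.0, 0.0]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity, for illustration only
weights, embedding = bilinear_word_attention(word_vecs, visual_feat, W)
```

In this toy setup the word aligned with the visual feature receives the largest weight and the opposing word the smallest, which is the intended behavior for down-weighting noisy words in titles and tags.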
