Unified Vision-Language Representation Modeling for E-Commerce Same-Style Products Retrieval
Ben Chen, Linbo Jin, Xinxin Wang, Dehong Gao, Wen Jiang, Wei Ning

TL;DR
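This paper introduces a unified vision-language model for e-commerce same-style product retrieval that represents each product by both its visual and textual features. It improves accuracy, especially in industries where images are less informative (such as machinery, hardware tools, and electronic components), and is deployed on Alibaba.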
Contribution
It proposes a novel joint embedding approach with a sampling strategy and contrastive loss for cross-modal product retrieval in e-commerce.
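To make the idea of a joint embedding trained with a contrastive loss concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the fusion step (concatenating normalized image and text vectors), the InfoNCE-style loss with in-batch negatives, the temperature value, and the names `fuse` and `info_nce` are all assumptions chosen for illustration. The paper's own sampling strategy and loss may differ.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length; a zero vector is returned unchanged.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def fuse(image_vec, text_vec):
    # Hypothetical fusion: concatenate the normalized image and text
    # vectors, then renormalize. A real model would learn this step.
    return l2_normalize(l2_normalize(image_vec) + l2_normalize(text_vec))

def info_nce(anchors, positives, temperature=0.1):
    # InfoNCE-style loss (an assumption, not the paper's stated loss):
    # positives[i] is the same-style match for anchors[i]; every other
    # row in the batch serves as a negative.
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [sum(x * y for x, y in zip(a, p)) / temperature
                  for p in positives]
        m = max(logits)  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)

# Toy batch: two products, each listed twice with slightly different features.
listings_a = [fuse([1.0, 0.0], [1.0, 0.0]), fuse([0.0, 1.0], [0.0, 1.0])]
listings_b = [fuse([0.9, 0.1], [1.0, 0.1]), fuse([0.1, 0.9], [0.0, 1.0])]

aligned = info_nce(listings_a, listings_b)         # correct pairing
shuffled = info_nce(listings_a, listings_b[::-1])  # wrong pairing
print(aligned < shuffled)
```

In this toy batch the loss is lower when each listing is paired with its true same-style counterpart than when the pairs are swapped, which is the behavior a contrastive objective rewards during training.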
Findings
Superior retrieval performance demonstrated offline
Increased user clicks and conversions online
Deployed successfully on Alibaba platform
Abstract
Same-style products retrieval plays an important role in e-commerce platforms, aiming to identify the same products even when they have different text descriptions or images. It can be used to retrieve similar products from different suppliers or to detect duplicate products from a single supplier. Common methods use the image as the detected object, but they consider only visual features and overlook the attribute information contained in the textual descriptions. As a result, they perform weakly for products in industries where images are less important, such as machinery, hardware tools, and electronic components, even if an additional text matching module is added. In this paper, we propose a unified vision-language modeling method for e-commerce same-style products retrieval, which is designed to represent one product with its textual descriptions and visual contents. It contains one sampling skill to collect positive…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques
