Extending CLIP for Category-to-image Retrieval in E-commerce

Mariya Hendriksen, Maurits Bleeker, Svitlana Vakulenko, Nanne van Noord, Ernst Kuiper, and Maarten de Rijke

arXiv:2112.11294 · cs.IR · January 5, 2022 · 1 citation

TL;DR

This paper introduces CLIP-ITA, a multimodal model for category-to-image retrieval in e-commerce, effectively combining textual, visual, and attribute data to improve search accuracy.

Contribution

The paper introduces the task of category-to-image retrieval in e-commerce and proposes CLIP-ITA, a multimodal model that builds product representations from textual, visual, and attribute information.

Findings

1. CLIP-ITA significantly outperforms a comparable model that uses only the visual modality.

2. Adding attribute information improves model accuracy.

3. Multimodal integration enhances e-commerce search effectiveness.

Abstract

E-commerce provides rich multimodal data that is barely leveraged in practice. One aspect of this data is a category tree that is being used in search and recommendation. However, in practice, during a user's session there is often a mismatch between a textual and a visual representation of a given category. Motivated by the problem, we introduce the task of category-to-image retrieval in e-commerce and propose a model for the task, CLIP-ITA. The model leverages information from multiple modalities (textual, visual, and attribute modality) to create product representations. We explore how adding information from multiple modalities (textual, visual, and attribute modality) impacts the model's performance. In particular, we observe that CLIP-ITA significantly outperforms a comparable model that leverages only the visual modality and a comparable model that leverages the visual and…
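
To make the retrieval setup in the abstract concrete, the following is a minimal sketch of category-to-image retrieval over fused product representations. It is not the authors' implementation: the mean-fusion step, the function names (`fuse_product`, `rank_products`), and the toy three-dimensional vectors are all assumptions made for illustration. In practice the per-modality vectors would come from trained encoders, and how CLIP-ITA actually combines them is not described on this page.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_product(image_vec, title_vec, attribute_vec):
    """Hypothetical fusion: average the normalized modality vectors.

    This is an assumption for illustration only; it is not taken from the paper.
    """
    parts = [l2_normalize(v) for v in (image_vec, title_vec, attribute_vec)]
    mean = [sum(vals) / len(parts) for vals in zip(*parts)]
    return l2_normalize(mean)


def dot(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


def rank_products(category_vec, products):
    """Return product ids sorted by similarity to the category query, best first."""
    query = l2_normalize(category_vec)
    scored = [(dot(query, fuse_product(*vecs)), pid) for pid, vecs in products.items()]
    return [pid for _, pid in sorted(scored, reverse=True)]


# Toy data: each product has (image, title, attribute) vectors in a shared space.
products = {
    "sneaker": ([1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [1.0, 0.0, 0.0]),
    "kettle": ([0.0, 1.0, 0.0], [0.0, 0.9, 0.1], [0.0, 1.0, 0.0]),
}
category_vec = [1.0, 0.1, 0.0]  # stand-in for an encoded category label
ranking = rank_products(category_vec, products)
```

Here `ranking` places the product whose fused representation lies closest to the category vector first. Swapping the mean for a learned projection would change only `fuse_product`, which is why the sketch keeps fusion and ranking separate.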

Code & Models

Repositories

mariyahendriksen/ecir2022_category_to_image_retrieval
PyTorch · Official

Taxonomy

Topics: Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques