Hierarchical Similarity Learning for Language-based Product Image Retrieval

Zhe Ma; Fenghao Liu; Jianfeng Dong; Xiaoye Qu; Yuan He; Shouling Ji

arXiv:2102.09375·cs.CV·February 19, 2021·1 cites

TL;DR

This paper introduces a Hierarchical Similarity Learning (HSL) network for language-based product image retrieval. HSL learns multi-level representations with stacked encoders and combines object-granularity and image-granularity similarities computed at each level, rather than matching vision and text at a single granularity.

Contribution

The paper proposes a novel HSL network that models multi-granularity similarities at different levels, enhancing cross-modal retrieval performance.

Findings

01 Effective on a large-scale product retrieval dataset

02 Improves matching accuracy by considering multiple granularities

03 Code and data are publicly available

Abstract

This paper addresses the language-based product image retrieval task. Most previous works have made significant progress by designing the network structure, similarity measurement, and loss function. However, they typically perform vision-text matching at a single granularity, regardless of the intrinsic multiple granularities of images. In this paper, we focus on the cross-modal similarity measurement and propose a novel Hierarchical Similarity Learning (HSL) network. HSL first learns multi-level representations of the input data with stacked encoders, and then computes an object-granularity similarity and an image-granularity similarity at each level. All the similarities are combined into the final hierarchical cross-modal similarity. Experiments on a large-scale product retrieval dataset demonstrate the effectiveness of the proposed method. Code and data are available at…
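The combination of per-level similarities described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how each similarity is computed or how they are combined, so the sketch assumes cosine similarity, max-pooling over objects for the object-granularity score, mean-pooling of object features for the image-granularity score, and an unweighted sum across levels. The function names `cosine` and `hierarchical_similarity` are hypothetical, and the per-level features are taken as given rather than produced by stacked encoders.

```python
import numpy as np


def cosine(a, b, eps=1e-8):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / (np.linalg.norm(a, axis=-1, keepdims=True) + eps)
    b = b / (np.linalg.norm(b, axis=-1, keepdims=True) + eps)
    return a @ b.T


def hierarchical_similarity(text_levels, object_levels):
    """Sum object- and image-granularity similarities over all levels.

    text_levels:   list of (d,) arrays, one query embedding per level.
    object_levels: list of (n_objects, d) arrays, one set of object
                   features per level.
    Assumption: pooling choices and the unweighted sum are illustrative,
    not taken from the paper.
    """
    total = 0.0
    for text, objects in zip(text_levels, object_levels):
        query = text[None, :]
        # Object granularity (assumed): best-matching object for the query.
        object_sim = cosine(query, objects).max()
        # Image granularity (assumed): query vs. mean-pooled object features.
        image_feat = objects.mean(axis=0, keepdims=True)
        image_sim = cosine(query, image_feat)[0, 0]
        total += object_sim + image_sim
    return float(total)


# Toy usage: two levels, three detected objects, 4-dimensional features.
rng = np.random.default_rng(0)
text_levels = [rng.normal(size=4) for _ in range(2)]
object_levels = [rng.normal(size=(3, 4)) for _ in range(2)]
score = hierarchical_similarity(text_levels, object_levels)
```

At retrieval time, a score of this kind would be computed between the query and every candidate image, and the candidates ranked by it. A learned weighting of the per-level terms is an equally plausible design; the plain sum is used here only to keep the sketch short.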

Code & Models

Repositories

liufh1/hsl (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Natural Language Processing Techniques