Cross-Modal Similarity Learning : A Low Rank Bilinear Formulation

Cuicui Kang; Shengcai Liao; Yonghao He; Jian Wang; Wenjia Niu; Shiming Xiang; Chunhong Pan

arXiv:1411.4738·cs.MM·December 18, 2015·5 cites

TL;DR

This paper introduces a novel low-rank bilinear similarity learning method for cross-modal retrieval, effectively addressing heterogeneity and dimensionality issues between different media modalities, and demonstrating superior performance on benchmark datasets.

Contribution

It proposes a new low-rank bilinear formulation with nuclear-norm penalization for cross-modal similarity learning, improving over existing metric learning approaches.
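As a rough illustration of the two ingredients named above, the sketch below scores an image feature against a text feature with a bilinear form s(x, y) = xᵀWy and computes the nuclear norm (sum of singular values) that a low-rank penalty would act on. The function names, feature dimensions, and example values are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def bilinear_similarity(x, y, W):
    """Score x (dimension d_x) against y (dimension d_y) as x^T W y.

    W has shape (d_x, d_y), so the two modalities may have different
    dimensions. Hypothetical helper, not the authors' code.
    """
    return float(x @ W @ y)

def nuclear_norm(W):
    """Sum of the singular values of W, a convex surrogate for rank."""
    return float(np.linalg.svd(W, compute_uv=False).sum())

# Toy example: a 3-dimensional "image" feature and a 2-dimensional "text" feature.
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, -1.0])
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

score = bilinear_similarity(x, y, W)
penalty = nuclear_norm(W)
```

Because W is rectangular, no shared feature dimension is needed, which is one way a bilinear form can sidestep the dimension mismatch that a single-modality distance metric runs into.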

Findings

01

Achieves state-of-the-art results on image-text retrieval datasets.

02

Uses accelerated proximal gradient for fast convergence.

03

Effectively handles heterogeneity and dimensionality in cross-modal features.
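For readers unfamiliar with the optimizer mentioned in the findings, here is a minimal sketch of accelerated proximal gradient with a nuclear-norm penalty. The proximal step for the nuclear norm is singular value thresholding, which is a standard result; everything else is an assumption for illustration: the names `svt` and `apg_nuclear`, the fixed step size, the FISTA-style momentum schedule, and the toy squared loss in the usage lines. The paper's actual loss function is not given in this listing.

```python
import numpy as np

def svt(W, tau):
    """Singular value thresholding: proximal operator of tau * ||W||_*."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # soft-threshold the singular values
    return (U * s_shrunk) @ Vt

def apg_nuclear(grad, shape, step, lam, n_iter=100):
    """Minimize f(W) + lam * ||W||_* given grad, the gradient of a smooth f.

    Hypothetical sketch: fixed step size (assumed <= 1 / Lipschitz constant
    of grad) and the usual FISTA momentum sequence.
    """
    W = np.zeros(shape)
    Z = W.copy()  # extrapolated point
    t = 1.0
    for _ in range(n_iter):
        W_next = svt(Z - step * grad(Z), step * lam)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Z = W_next + ((t - 1.0) / t_next) * (W_next - W)
        W, t = W_next, t_next
    return W

# Toy usage (assumed loss): f(W) = 0.5 * ||W - M||_F^2, whose gradient is W - M.
M = np.diag([3.0, 1.0])
W_hat = apg_nuclear(lambda W: W - M, M.shape, step=1.0, lam=0.5, n_iter=5)
```

Thresholding sets small singular values exactly to zero, which is how the penalty pushes the learned matrix toward low rank.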

Abstract

The cross-media retrieval problem has received much attention in recent years due to the rapid growth of multimedia data on the Internet. A new approach to the problem has emerged that aims to match features of different modalities directly. This line of research faces two critical issues: how to remove the heterogeneity between different modalities and how to match cross-modal features of different dimensions. Recently, metric learning methods have shown good capability in learning a distance metric to explore the relationship between data points. However, traditional metric learning algorithms focus only on single-modal features and have difficulty handling cross-modal features of different dimensions. In this paper, we propose a cross-modal similarity learning algorithm for cross-modal feature matching. The proposed method takes a bilinear…

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Multimodal Machine Learning Applications