Compositional Learning of Image-Text Query for Image Retrieval

Muhammad Umer Anwaar; Egor Labintcev; Martin Kleinsteuber

arXiv:2006.11149 · cs.CV · June 2, 2021

TL;DR

This paper introduces ComposeAE, an autoencoder-based model that composes image-text queries for image retrieval and outperforms existing methods such as TIRG on the MIT-States, Fashion200k, and Fashion IQ datasets.

Contribution

We propose ComposeAE, a new autoencoder model with a deep metric learning approach and rotational symmetry constraint for improved image retrieval with image-text queries.

Findings

01

Outperforms TIRG on MIT-States, Fashion200k, and Fashion IQ datasets.

02

Introduces a rotational symmetry constraint to enhance model performance.

03

Provides strong baselines and reproducible code for future research.

Abstract

In this paper, we investigate the problem of retrieving images from a database based on a multi-modal (image-text) query. Specifically, the query text prompts some modification in the query image, and the task is to retrieve images with the desired modifications. For instance, a user of an e-commerce platform is interested in buying a dress that should look similar to her friend's dress, but in white with a ribbon sash. In this case, we would like the algorithm to retrieve dresses that apply the desired modifications to the query dress. We propose an autoencoder-based model, ComposeAE, to learn the composition of image and text query for retrieving images. We adopt a deep metric learning approach and learn a metric that pushes the composition of source image and text query closer to the target images. We also propose a rotational symmetry constraint on the optimization…
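
To make the metric learning idea in the abstract concrete, the following is a minimal sketch of a batch-based softmax loss that rewards a composed query embedding for being closer to its own target than to the other targets in the batch. It is not taken from the paper or from the ecom-research/ComposeAE repository: the function name `batch_softmax_loss`, the cosine-similarity scoring, and the toy 2-D embeddings are all assumptions chosen for illustration, and the composition step itself (how image and text features are combined) is deliberately left out.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def batch_softmax_loss(composed, targets):
    """Mean cross-entropy over a batch (hypothetical helper, not the paper's code).

    Row i of `composed` is a composed image-text query embedding and row i of
    `targets` is the embedding of its matching target image. The loss is low
    when each query is most similar to its own target.
    """
    composed = [normalize(q) for q in composed]
    targets = [normalize(t) for t in targets]
    total = 0.0
    for i, q in enumerate(composed):
        sims = [dot(q, t) for t in targets]
        # Numerically stable log-sum-exp over similarities to all targets.
        m = max(sims)
        log_z = m + math.log(sum(math.exp(s - m) for s in sims))
        # Negative log-probability of picking the correct target.
        total += log_z - sims[i]
    return total / len(composed)


# Toy embeddings (assumed values, for illustration only).
targets = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # queries near their own targets
swapped = [[0.1, 0.9], [0.9, 0.1]]   # queries near the wrong targets

loss_aligned = batch_softmax_loss(aligned, targets)
loss_swapped = batch_softmax_loss(swapped, targets)
print(loss_aligned < loss_swapped)
```

Minimizing a loss of this shape pulls each composed query toward its target and away from the other images in the batch, which is the general behavior the abstract describes; the loss actually used by ComposeAE, including its rotational symmetry constraint, may differ.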

Code & Models

Repositories

ecom-research/ComposeAE (PyTorch, official)
