Exploiting Distribution Constraints for Scalable and Efficient Image Retrieval

Mohammad Omama; Po-han Li; Sandeep P. Chinchali

arXiv:2410.07022 · cs.IR · April 3, 2025

Open Access

TL;DR

This paper introduces AE-SVC and (SS)$_2$D, two novel methods that enhance the scalability and efficiency of image retrieval systems by improving foundation model embeddings and optimizing their size-performance trade-offs.

Contribution

The paper proposes AE-SVC, which improves the retrieval performance of foundation model embeddings, and (SS)$_2$D, which learns embeddings of adaptive size. Together they address the scalability and efficiency challenges of image retrieval.

Findings

01. AE-SVC improves retrieval performance by up to 16%.

02. (SS)$_2$D enhances performance by 10% for smaller embeddings.

03. Experiments were conducted on four datasets with four foundation models.

Abstract

Image retrieval is crucial in robotics and computer vision, with downstream applications in robot place recognition and vision-based product recommendations. Modern retrieval systems face two key challenges: scalability and efficiency. State-of-the-art image retrieval systems train specific neural networks for each dataset, an approach that lacks scalability. Furthermore, since retrieval speed is directly proportional to embedding size, existing systems that use large embeddings lack efficiency. To tackle scalability, recent works propose using off-the-shelf foundation models. However, these models, though applicable across datasets, fall short in achieving performance comparable to that of dataset-specific models. Our key observation is that, while foundation models capture necessary subtleties for effective retrieval, the underlying distribution of their embedding space can negatively…
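The abstract's point that retrieval cost grows with embedding size can be illustrated with a minimal brute-force cosine-similarity search. This is only a sketch under stated assumptions: the random arrays stand in for foundation-model embeddings, the helper names `l2_normalize` and `retrieve_top_k` are hypothetical, and the dimension truncation at the end is a naive stand-in for a smaller embedding, not AE-SVC or (SS)$_2$D.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def retrieve_top_k(query: np.ndarray, gallery: np.ndarray, k: int = 5) -> np.ndarray:
    """Return, for each query row, the indices of the k most similar gallery rows."""
    q = l2_normalize(query)
    g = l2_normalize(gallery)
    # Similarity matrix of shape (num_queries, num_gallery).
    # The work per query-gallery pair is one d-dimensional dot product,
    # so total cost scales linearly with the embedding size d.
    sims = q @ g.T
    return np.argsort(-sims, axis=1)[:, :k]


# Assumption: random vectors stand in for real image embeddings.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(1000, 768))
# Queries are slightly perturbed copies of the first three gallery items.
query = gallery[:3] + 0.01 * rng.normal(size=(3, 768))

top = retrieve_top_k(query, gallery, k=5)

# Naive illustration only: keeping the first 64 dimensions makes each
# dot product 12x cheaper. The paper's methods are not reproduced here.
top_small = retrieve_top_k(query[:, :64], gallery[:, :64], k=5)
```

On this synthetic data both searches return each query's source item first. On real embeddings, plain truncation would typically cost accuracy, and that size-performance trade-off is what the abstract describes as the efficiency challenge.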

Videos

Exploiting Distribution Constraints for Scalable and Efficient Image Retrieval · SlidesLive

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Data Management and Algorithms

Methods: Contrastive Language-Image Pre-training · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings