Aggregating Binary Local Descriptors for Image Retrieval

Giuseppe Amato; Fabrizio Falchi; Lucia Vadicamo

arXiv:1608.00813·cs.CV·March 3, 2017

TL;DR

This paper compares aggregation methods for binary local descriptors in image retrieval, formalizes the Fisher Kernel for Bernoulli Mixture Models, and explores combining aggregated binary features with CNN features, which improves retrieval accuracy over CNN features alone.

Contribution

The paper provides an extensive comparison of aggregation methods for binary features, formalizes the Fisher Kernel for Bernoulli Mixture Models, and demonstrates the benefit of combining binary features with CNN features.
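
As a rough illustration of the Fisher Kernel idea for a Bernoulli Mixture Model, the sketch below computes the average gradient of the log-likelihood with respect to the Bernoulli means for a set of binary descriptors. It is a minimal sketch under stated assumptions, not the paper's formulation: the function names `bmm_posteriors` and `fisher_score_means` are hypothetical, the mixture parameters are assumed to be already estimated, and the Fisher information normalization that a full Fisher Vector would apply is omitted.

```python
import math


def bmm_posteriors(x, weights, means):
    """Posterior probability of each Bernoulli component given binary vector x."""
    log_joint = []
    for w, mu in zip(weights, means):
        ll = math.log(w)
        for bit, m in zip(x, mu):
            # Bernoulli log-likelihood of one bit (means assumed strictly in (0, 1))
            ll += math.log(m) if bit else math.log(1.0 - m)
        log_joint.append(ll)
    top = max(log_joint)  # subtract the max for numerical stability
    unnorm = [math.exp(v - top) for v in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def fisher_score_means(descriptors, weights, means):
    """Average log-likelihood gradient w.r.t. the Bernoulli means.

    Returns a flat list of length K * D (K components, D bits).
    Assumption: no Fisher information normalization is applied.
    """
    K, D = len(means), len(means[0])
    grad = [[0.0] * D for _ in range(K)]
    for x in descriptors:
        gamma = bmm_posteriors(x, weights, means)
        for k in range(K):
            for d in range(D):
                mu = means[k][d]
                # d/d(mu) of log p(x) = gamma_k * (x_d - mu) / (mu * (1 - mu))
                grad[k][d] += gamma[k] * (x[d] - mu) / (mu * (1.0 - mu))
    n = len(descriptors)
    return [g / n for row in grad for g in row]


if __name__ == "__main__":
    # Toy example: four 3-bit descriptors and a hypothetical 2-component mixture.
    descriptors = [[1, 0, 1], [1, 1, 1], [0, 0, 1], [0, 1, 0]]
    weights = [0.5, 0.5]
    means = [[0.8, 0.3, 0.9], [0.2, 0.6, 0.4]]
    print(fisher_score_means(descriptors, weights, means))
```

The result is a single fixed-length vector per image regardless of how many local descriptors were extracted, which is what allows aggregated representations to be compared directly instead of matching local features one by one.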

Findings

01 Aggregation methods are effective for binary features.

02 Fisher Vector on binary features improves retrieval performance.

03 Binary features combined with CNNs outperform standalone CNN features.

Abstract

Content-Based Image Retrieval based on local features is computationally expensive because of the complexity of both extracting and matching local features. On one hand, the cost of extracting, representing, and comparing local visual descriptors has been dramatically reduced by recently proposed binary local features. On the other hand, aggregation techniques provide a meaningful summarization of all the features extracted from an image into a single descriptor, allowing us to speed up and scale up the image search. Only a few recent works have combined these two research directions by defining aggregation methods for binary local features, in order to leverage the advantages of both approaches. In this paper, we report an extensive comparison among state-of-the-art aggregation methods applied to binary features. Then, we mathematically formalize the application of Fisher…
