Aggregation of binary feature descriptors for compact scene model representation in large scale structure-from-motion applications

Jacek Komorowski; Tomasz Trzcinski

arXiv:1809.11062·cs.CV·October 1, 2018

Open Access

TL;DR

This paper introduces a method to aggregate binary feature descriptors into compact prototypes, significantly reducing memory usage and speeding up matching in large-scale structure-from-motion and SLAM applications.

Contribution

It proposes a novel aggregation technique that converts binary descriptors into low-dimensional real-valued prototypes for efficient large-scale 3D scene modeling.

Findings

01. Memory usage is significantly reduced.

02. Matching speed is improved using approximate nearest neighbor search.

03. The method is effective for large-scale structure-from-motion.

Abstract

In this paper we present an efficient method for aggregating binary feature descriptors to allow a compact representation of the 3D scene model in incremental structure-from-motion and SLAM applications. All feature descriptors linked with one 3D scene point or landmark are represented by a single low-dimensional real-valued vector called a prototype. The method allows a significant reduction of the memory required to store and process feature descriptors in large-scale structure-from-motion applications. Efficient approximate nearest neighbour search methods suited to real-valued descriptors, such as FLANN, can be used on the resulting prototypes to speed up the matching of processed frames.
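The idea of replacing all binary descriptors of one landmark with a single real-valued prototype can be illustrated with a minimal sketch. The abstract does not state the aggregation rule, so the per-bit mean used here is an assumption, not the authors' method. The sketch also performs no dimensionality reduction, and an exact brute-force search stands in for an approximate index such as FLANN. The names `unpack_bits`, `aggregate`, and `nearest_prototype` are hypothetical.

```python
from typing import List, Sequence


def unpack_bits(descriptor: bytes) -> List[int]:
    """Expand a packed binary descriptor into 0/1 values, most significant bit first."""
    return [(byte >> (7 - i)) & 1 for byte in descriptor for i in range(8)]


def aggregate(descriptors: Sequence[bytes]) -> List[float]:
    """Collapse all descriptors of one landmark into one real-valued vector.

    Assumption: the prototype is the per-bit mean. The paper's actual
    aggregation rule may differ.
    """
    if not descriptors:
        raise ValueError("need at least one descriptor")
    bits = [unpack_bits(d) for d in descriptors]
    n = len(bits)
    return [sum(column) / n for column in zip(*bits)]


def nearest_prototype(query: bytes, prototypes: Sequence[Sequence[float]]) -> int:
    """Return the index of the prototype closest to the query descriptor.

    Exact brute-force search with squared Euclidean distance, used here
    only as a stand-in for an approximate nearest neighbour index.
    """
    q = unpack_bits(query)

    def squared_distance(p: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(q, p))

    return min(range(len(prototypes)), key=lambda i: squared_distance(prototypes[i]))


# Toy 8-bit descriptors (real binary descriptors are typically much longer).
landmark_a = [bytes([0b11110000]), bytes([0b11100000]), bytes([0b11110001])]
landmark_b = [bytes([0b00001111]), bytes([0b00001110])]
prototypes = [aggregate(landmark_a), aggregate(landmark_b)]

# A new observation is matched against one vector per landmark,
# instead of against every stored descriptor.
match = nearest_prototype(bytes([0b11110000]), prototypes)
```

In this sketch the scene model stores one vector per landmark regardless of how many frames observed it, which is where the memory saving described in the abstract would come from.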

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings