Aggregation of binary feature descriptors for compact scene model representation in large scale structure-from-motion applications
Jacek Komorowski, Tomasz Trzcinski

TL;DR
This paper introduces a method to aggregate binary feature descriptors into compact prototypes, significantly reducing memory usage and speeding up matching in large-scale structure-from-motion and SLAM applications.
Contribution
It proposes a novel aggregation technique that converts binary descriptors into low-dimensional real-valued prototypes for efficient large-scale 3D scene modeling.
Findings
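Keeping a single prototype per 3D scene point, instead of every linked descriptor, significantly reduces the memory needed to store and process feature descriptors.
Because prototypes are real-valued, approximate nearest neighbour search methods such as FLANN can be used to speed up matching of processed frames.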
The method is effective for large-scale structure-from-motion.
Abstract
In this paper we present an efficient method for aggregating binary feature descriptors to allow a compact representation of the 3D scene model in incremental structure-from-motion and SLAM applications. All feature descriptors linked with one 3D scene point or landmark are represented by a single low-dimensional real-valued vector called a prototype. The method allows a significant reduction of the memory required to store and process feature descriptors in large-scale structure-from-motion applications. Efficient approximate nearest neighbour search methods suited for real-valued descriptors, such as FLANN, can be used on the resulting prototypes to speed up matching of processed frames.
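The abstract does not spell out how a prototype is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a per-bit mean as a stand-in aggregation (which keeps one value per bit rather than producing a truly low-dimensional vector) and uses a brute-force squared Euclidean search in place of an approximate index such as FLANN. The names `unpack_bits`, `aggregate`, and `nearest_landmark` are hypothetical.

```python
from typing import Dict, List, Sequence


def unpack_bits(desc: bytes) -> List[int]:
    """Expand a packed binary descriptor into 0/1 values, most significant bit first."""
    return [(byte >> shift) & 1 for byte in desc for shift in range(7, -1, -1)]


def aggregate(descriptors: Sequence[bytes]) -> List[float]:
    """Assumed stand-in prototype: per-bit mean over all descriptors of one landmark."""
    if not descriptors:
        raise ValueError("need at least one descriptor")
    bits = [unpack_bits(d) for d in descriptors]
    count = len(bits)
    return [sum(column) / count for column in zip(*bits)]


def nearest_landmark(query: bytes, prototypes: Dict[int, List[float]]) -> int:
    """Brute-force nearest prototype (an approximate index would replace this loop)."""
    q = unpack_bits(query)

    def sqdist(proto: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(q, proto))

    return min(prototypes, key=lambda landmark_id: sqdist(prototypes[landmark_id]))


# Toy 16-bit descriptors: two observations for each of two landmarks.
observations = {
    0: [b"\xff\x00", b"\xfe\x00"],
    1: [b"\x00\xff", b"\x01\xff"],
}
# One real-valued vector per landmark replaces the list of binary descriptors.
prototypes = {lid: aggregate(descs) for lid, descs in observations.items()}
match = nearest_landmark(b"\xff\x01", prototypes)
```

In this toy setup the scene model stores one vector per landmark regardless of how many frames observed it, which is the source of the memory saving the abstract describes.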
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
