L-DYNO: Framework to Learn Consistent Visual Features Using Robot's Motion
Kartikeya Singh, Charuvaran Adhivarahan, Karthik Dantu

TL;DR
L-DYNO is a framework that learns visual features consistent with the robot's motion, as estimated from an external signal such as inertial data. Selecting only those features makes visual odometry faster and more accurate.
Contribution
It introduces a novel representation learning approach that identifies motion-consistent visual features using external signals, enhancing feature selection for robot perception tasks.
Findings
49% reduction in image search space
4.3% reduction in visual odometry execution time
Lower reprojection errors
Abstract
Historically, feature-based approaches have been used extensively for camera-based robot perception tasks such as localization, mapping, tracking, and others. Several of these approaches also combine other sensors (inertial sensing, for example) to perform combined state estimation. Our work rethinks this approach; we present a representation learning mechanism that identifies visual features that best correspond to robot motion as estimated by an external signal. Specifically, we utilize the robot's transformations through an external signal (inertial sensing, for example) and give attention to image space that is most consistent with the external signal. We use a pairwise consistency metric as a representation to keep the visual features consistent through a sequence with the robot's relative pose transformations. This approach enables us to incorporate information from the robot's…
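To make the idea of checking visual features against an externally estimated relative pose concrete, here is a minimal sketch. It is not the paper's learned pairwise consistency metric: it swaps in the classical epipolar constraint as a stand-in, and the function names (skew, epipolar_residuals, select_consistent), the threshold value, and the assumption that matches are given as normalized homogeneous image points are all hypothetical choices made for illustration.

```python
import numpy as np


def skew(t):
    """Return the 3x3 cross-product matrix [t]_x of a 3-vector t."""
    return np.array([
        [0.0, -t[2], t[1]],
        [t[2], 0.0, -t[0]],
        [-t[1], t[0], 0.0],
    ])


def epipolar_residuals(R, t, x1, x2):
    """Algebraic epipolar residual |x2^T E x1| for each feature match.

    R, t : relative pose from the external signal (assumed convention:
           a point X1 in frame 1 maps to X2 = R @ X1 + t in frame 2).
    x1, x2 : (N, 3) arrays of matched points in normalized homogeneous
             image coordinates (last component equal to 1).
    """
    E = skew(t) @ R  # essential matrix implied by the external pose
    return np.abs(np.einsum("ni,ij,nj->n", x2, E, x1))


def select_consistent(R, t, x1, x2, threshold=1e-2):
    """Boolean mask of matches that agree with the external motion.

    The threshold is an illustrative value, not one taken from the paper.
    """
    return epipolar_residuals(R, t, x1, x2) < threshold
```

Matches whose residual stays below the threshold move in a way the external pose estimate can explain; the rest (for example, features on moving objects or bad matches) would be dropped before pose estimation. A learned approach such as the one the abstract describes would replace this hand-written geometric test with a representation trained on sequences.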
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging
