RAM-VO: Less is more in Visual Odometry
Iury Cleveston, Esther L. Colombini

TL;DR
RAM-VO is a lightweight visual odometry model that observes only 5.7% of the visual information and uses about 3 million parameters, achieving competitive results on the KITTI dataset at lower computational cost.
Contribution
The paper presents RAM-VO, an extension of the Recurrent Attention Model (RAM) adapted to visual odometry, which reduces both the amount of input data processed and the number of parameters required.
Findings
RAM-VO uses about 3 million parameters for 6-DOF regression.
It achieves competitive results on the KITTI dataset.
RAM-VO operates effectively with only 5.7% of visual information.
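To give a sense of what observing only a small fraction of each frame means, here is a minimal sketch that crops square glimpses from an image and reports the fraction of pixels they cover. The frame size, glimpse size, glimpse centres, and function names are all hypothetical and are not taken from the paper; RAM-VO's actual glimpse sensor may differ.

```python
def extract_glimpse(image, center_row, center_col, size):
    """Crop a size x size patch around a centre, clamped to the image bounds.

    Assumes size <= image height and size <= image width.
    Returns the patch and its (top, left) corner.
    """
    height, width = len(image), len(image[0])
    half = size // 2
    top = min(max(center_row - half, 0), height - size)
    left = min(max(center_col - half, 0), width - size)
    patch = [row[left:left + size] for row in image[top:top + size]]
    return patch, (top, left)


def coverage_fraction(height, width, corners, size):
    """Fraction of distinct pixels covered by the glimpses (overlaps counted once)."""
    seen = set()
    for top, left in corners:
        for r in range(top, top + size):
            for c in range(left, left + size):
                seen.add((r, c))
    return len(seen) / (height * width)


# Hypothetical values, chosen only for illustration.
HEIGHT, WIDTH, GLIMPSE = 64, 128, 16
image = [[r * WIDTH + c for c in range(WIDTH)] for r in range(HEIGHT)]
centres = [(10, 20), (40, 100)]

corners = []
for row, col in centres:
    patch, corner = extract_glimpse(image, row, col, GLIMPSE)
    corners.append(corner)

fraction = coverage_fraction(HEIGHT, WIDTH, corners, GLIMPSE)
print(f"observed fraction: {fraction:.4f}")  # 2 * 16 * 16 / (64 * 128) = 0.0625
```

In this toy setting two non-overlapping 16 x 16 glimpses cover 6.25% of a 64 x 128 frame, so only that share of the pixels would be passed to any downstream network.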
Abstract
Building vehicles capable of operating without human supervision requires the determination of the agent's pose. Visual Odometry (VO) algorithms estimate the egomotion using only visual changes from the input images. The most recent VO methods implement deep-learning techniques using convolutional neural networks (CNNs) extensively, which add a substantial cost when dealing with high-resolution images. Furthermore, in VO tasks, more input data does not mean a better prediction; on the contrary, the architecture may filter out useless information. Therefore, the implementation of computationally efficient and lightweight architectures is essential. In this work, we propose RAM-VO, an extension of the Recurrent Attention Model (RAM) for visual odometry tasks. RAM-VO improves the visual and temporal representation of information and implements the Proximal Policy Optimization (PPO)…
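For reference, the standard clipped surrogate objective of PPO in its general form is given below. The abstract as reproduced here is cut off and does not say how RAM-VO applies PPO, so this should not be read as the paper's exact loss.

```latex
L^{\mathrm{CLIP}}(\theta) =
  \hat{\mathbb{E}}_t\!\left[
    \min\!\left(
      r_t(\theta)\,\hat{A}_t,\;
      \operatorname{clip}\!\left(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\right)\hat{A}_t
    \right)
  \right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}
```

Here \hat{A}_t is an advantage estimate at step t and \epsilon is the clipping range. Clipping the probability ratio r_t(\theta) keeps each policy update close to the policy that collected the data.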
Taxonomy
Topics: Robotics and Sensor-Based Localization · Retinal Imaging and Analysis · Soft Robotics and Applications
