Neural Volumetric Memory for Visual Locomotion Control
Ruihan Yang, Ge Yang, Xiaolong Wang

TL;DR
This paper introduces Neural Volumetric Memory, a geometric memory architecture that improves vision-based locomotion control on challenging terrains by explicitly modeling 3D scene geometry and leveraging past observations.
Contribution
It proposes a novel neural volumetric memory architecture that explicitly encodes 3D geometry and SE(3) equivariance for improved terrain understanding in legged robots.
Findings
Outperforms naive methods in physical robot tests
Memory captures sufficient scene geometry for reconstruction
Explicit geometric priors enhance locomotion performance
Abstract
Legged robots have the potential to expand the reach of autonomy beyond paved roads. In this work, we consider the difficult problem of locomotion on challenging terrains using a single forward-facing depth camera. Due to the partial observability of the problem, the robot has to rely on past observations to infer the terrain currently beneath it. To solve this problem, we follow the paradigm in computer vision that explicitly models the 3D geometry of the scene and propose Neural Volumetric Memory (NVM), a geometric memory architecture that explicitly accounts for the SE(3) equivariance of the 3D world. NVM aggregates feature volumes from multiple camera views by first bringing them back to the ego-centric frame of the robot. We test the learned visual-locomotion policy on a physical robot and show that our approach, which explicitly introduces geometric priors during training, offers…
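The aggregation step described in the abstract (bringing feature volumes from several past views into the robot's current ego-centric frame, then fusing them) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (make_grid, warp_volume, aggregate), the nearest-neighbour resampling, the [-1, 1] voxel coordinate convention, and the use of a plain mean for fusion are all assumptions made for illustration, and the rigid transforms are supplied by hand rather than estimated by a network.

```python
import numpy as np

def make_grid(size):
    """Voxel-centre coordinates in [-1, 1]^3, shape (size, size, size, 3)."""
    axis = np.linspace(-1.0, 1.0, size)
    return np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)

def warp_volume(volume, rotation, translation):
    """Resample a (C, D, D, D) feature volume into the current frame.

    Assumed convention: rotation (3x3) and translation (3,) map a point in
    the current frame to the past frame, p_past = R @ p_cur + t.
    Uses nearest-neighbour lookup and fills out-of-range voxels with zeros.
    """
    size = volume.shape[1]
    grid = make_grid(size)                          # current-frame coordinates
    src = grid @ rotation.T + translation           # same points in the past frame
    idx = np.rint((src + 1.0) * 0.5 * (size - 1)).astype(int)
    valid = np.all((idx >= 0) & (idx < size), axis=-1)
    idx = np.clip(idx, 0, size - 1)                 # keep indexing in bounds
    warped = volume[:, idx[..., 0], idx[..., 1], idx[..., 2]]
    return warped * valid                           # zero out invalid voxels

def aggregate(volumes, rotations, translations):
    """Warp every past volume into the current frame and average them."""
    warped = [
        warp_volume(v, r, t)
        for v, r, t in zip(volumes, rotations, translations)
    ]
    return np.mean(warped, axis=0)

# Tiny usage example: two views of the same 2-channel, 5^3 volume.
rng = np.random.default_rng(0)
vol = rng.normal(size=(2, 5, 5, 5))
identity = np.eye(3)
no_shift = np.zeros(3)
one_voxel = np.array([2.0 / (5 - 1), 0.0, 0.0])     # one voxel step along axis 0

same = warp_volume(vol, identity, no_shift)
shifted = warp_volume(vol, identity, one_voxel)
memory = aggregate([vol, vol], [identity, identity], [no_shift, no_shift])
```

With the identity transform the volume is returned unchanged, and a one-voxel translation shifts the contents by one index along the first spatial axis, leaving zeros where no past observation is available. A learned system would presumably replace the hand-supplied transforms and the raw volumes with network outputs, but the frame-alignment idea is the same.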
Taxonomy
Topics: Advanced Vision and Imaging · Robotic Locomotion and Control · Cell Image Analysis Techniques
Methods: Test
