Coupling Vision and Proprioception for Navigation of Legged Robots
Zipeng Fu, Ashish Kumar, Ananye Agarwal, Haozhi Qi, Jitendra Malik, Deepak Pathak

TL;DR
This paper presents VP-Nav, a navigation system for legged robots that combines vision and proprioception to improve path safety and adaptability in complex environments, demonstrated through real-world deployment.
Contribution
It introduces a novel integration of vision and proprioception for high-level navigation planning in legged robots, enhancing safety and terrain adaptability.
Findings
Superior performance over wheeled robot baselines
Effective detection of unexpected obstacles and terrain properties
Successful real-world deployment on a quadruped robot
Abstract
We exploit the complementary strengths of vision and proprioception to develop a point-goal navigation system for legged robots, called VP-Nav. Legged systems are capable of traversing more complex terrain than wheeled robots, but to fully utilize this capability, we need a high-level path planner in the navigation system to be aware of the walking capabilities of the low-level locomotion policy in varying environments. We achieve this by using proprioceptive feedback to ensure the safety of the planned path by sensing unexpected obstacles like glass walls, terrain properties like slipperiness or softness of the ground and robot properties like extra payload that are likely missed by vision. The navigation system uses onboard cameras to generate an occupancy map and a corresponding cost map to reach the goal. A fast marching planner then generates a target path. A velocity command…
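The planning loop described in the abstract (cost map, path toward the goal, replanning when proprioception flags something vision missed) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it swaps in a Dijkstra-style wavefront on a 4-connected grid as a simple stand-in for the fast marching planner, assumes strictly positive cell costs and a free goal cell, and all function names (`arrival_times`, `extract_path`, `mark_blocked`) are hypothetical.

```python
import heapq
import math

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connected grid moves


def _neighbors(r, c, rows, cols):
    """Yield in-bounds 4-connected neighbors of cell (r, c)."""
    for dr, dc in STEPS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc


def arrival_times(cost, goal):
    """Cost-to-go from every cell to the goal (Dijkstra wavefront).

    cost[r][c] is a positive traversal cost; math.inf marks occupied cells.
    This is a stand-in for a fast marching solver, not an Eikonal solution.
    """
    rows, cols = len(cost), len(cost[0])
    dist = [[math.inf] * cols for _ in range(rows)]
    dist[goal[0]][goal[1]] = 0.0
    heap = [(0.0, goal)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale heap entry
        for nr, nc in _neighbors(r, c, rows, cols):
            if math.isinf(cost[nr][nc]):
                continue  # occupied cell, never expanded
            nd = d + cost[nr][nc]
            if nd < dist[nr][nc]:
                dist[nr][nc] = nd
                heapq.heappush(heap, (nd, (nr, nc)))
    return dist


def extract_path(dist, start):
    """Follow the steepest descent of the cost-to-go field to the goal."""
    if math.isinf(dist[start[0]][start[1]]):
        return None  # goal unreachable from start
    rows, cols = len(dist), len(dist[0])
    path = [start]
    r, c = start
    while dist[r][c] > 0.0:
        r, c = min(_neighbors(r, c, rows, cols),
                   key=lambda p: dist[p[0]][p[1]])
        path.append((r, c))
    return path


def mark_blocked(cost, cell):
    """Hypothetical proprioceptive update: a collision was sensed at `cell`
    (for example a glass wall missed by vision), so mark it occupied."""
    cost[cell[0]][cell[1]] = math.inf
```

In use, one would compute `arrival_times` on the vision-derived cost map, follow `extract_path` from the robot's cell, and, when a collision is sensed, call `mark_blocked` and replan. How VP-Nav actually fuses proprioceptive estimates into its cost map is not specified in the text above, so the update rule here is only an assumption.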
Taxonomy
Topics: Robotic Locomotion and Control
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Attentive Walk-Aggregating Graph Neural Network
