BEVNav: Robot Autonomous Navigation Via Spatial-Temporal Contrastive Learning in Bird's-Eye View
Jiahao Jiang, Yuxiang Yang, Yingqi Deng, Chenlong Ma, Jing Zhang

TL;DR
BEVNav introduces a novel spatial-temporal contrastive learning approach for bird's-eye view representations, significantly improving robot navigation in complex, map-less environments through deep reinforcement learning.
Contribution
It pioneers the integration of spatial-temporal contrastive learning with reinforcement learning for BEV-based robot navigation, enhancing decision-making in dynamic environments.
Findings
Outperforms state-of-the-art navigation methods in dense pedestrian scenarios.
Demonstrates robustness across multiple benchmark environments.
Achieves superior navigation success rates and reliability.
Abstract
Goal-driven mobile robot navigation in map-less environments requires effective state representations for reliable decision-making. Inspired by the favorable properties of Bird's-Eye View (BEV) in point clouds for visual perception, this paper introduces a novel navigation approach named BEVNav. It employs deep reinforcement learning to learn BEV representations and enhance decision-making reliability. First, we propose a self-supervised spatial-temporal contrastive learning approach to learn BEV representations. Spatially, two randomly augmented views from a point cloud predict each other, enhancing spatial features. Temporally, we combine the current observation with consecutive frames' actions to predict future features, establishing the relationship between observation transitions and actions to capture temporal cues. Then, incorporating this spatial-temporal contrastive learning in…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
