Deep Visual Navigation under Partial Observability
Bo Ai, Wei Gao, Vinay, David Hsu

TL;DR
This paper presents a neural network-based controller for robot visual navigation in complex, partially observable environments, integrating multi-scale spatial representations, temporal encoding, and multimodal behaviors to improve robustness.
Contribution
It introduces a unified neural network architecture combining CNNs, LSTMs, and behavior-specific memory modules for improved visual navigation under partial observability.
Findings
Significant performance improvement in navigation tasks
Effective handling of complex visual and partial observations
Successful deployment on a quadrupedal robot in real-world scenarios
Abstract
How can a robot navigate successfully in rich and diverse environments, indoors or outdoors, along office corridors or trails on the grassland, on the flat ground or the staircase? To this end, this work aims to address three challenges: (i) complex visual observations, (ii) partial observability of local visual sensing, and (iii) multimodal robot behaviors conditioned on both the local environment and the global navigation objective. We propose to train a neural network (NN) controller for local navigation via imitation learning. To tackle complex visual observations, we extract multi-scale spatial representations through CNNs. To tackle partial observability, we aggregate multi-scale spatial information over time and encode it in LSTMs. To learn multimodal behaviors, we use a separate memory module for each behavior mode. Importantly, we integrate the multiple neural network modules…
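The pipeline the abstract describes (multi-scale spatial features, temporal aggregation in a recurrent memory, and one memory module per behavior mode) can be illustrated with a small sketch. This is not the authors' implementation. Average pooling at two window sizes stands in for the CNN feature extractor, the weights are random rather than trained by imitation learning, and the mode names, layer sizes, two-value action output, and the choice to update only the active mode's memory are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def avg_pool(image, k):
    """Average-pool a square 2D list with a k x k window.
    Stand-in for one scale of a CNN feature extractor (assumption)."""
    n = len(image)
    return [
        sum(image[r + i][c + j] for i in range(k) for j in range(k)) / (k * k)
        for r in range(0, n, k)
        for c in range(0, n, k)
    ]


def multi_scale_features(image, scales=(2, 4)):
    """Concatenate pooled features from several window sizes."""
    feats = []
    for k in scales:
        feats.extend(avg_pool(image, k))
    return feats


class LSTMCell:
    """Standard LSTM cell with random, untrained weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        n_in = input_size + hidden_size
        # Rows are grouped by gate: input, forget, candidate, output.
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(4 * hidden_size)]
        self.b = [0.0] * (4 * hidden_size)

    def step(self, x, state):
        h, c = state
        z = x + h  # concatenate input and previous hidden state
        pre = [sum(w * v for w, v in zip(row, z)) + b
               for row, b in zip(self.W, self.b)]
        H = self.hidden_size
        i = [sigmoid(p) for p in pre[0:H]]
        f = [sigmoid(p) for p in pre[H:2 * H]]
        g = [math.tanh(p) for p in pre[2 * H:3 * H]]
        o = [sigmoid(p) for p in pre[3 * H:4 * H]]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


class MultiModeController:
    """One recurrent memory and one linear action head per behavior mode."""

    def __init__(self, modes, feature_size, hidden_size=8, seed=0):
        rng = random.Random(seed)
        self.cells = {m: LSTMCell(feature_size, hidden_size, rng) for m in modes}
        # Two outputs per head, e.g. linear and angular velocity (assumption).
        self.heads = {m: [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
                          for _ in range(2)] for m in modes}
        self.states = {m: ([0.0] * hidden_size, [0.0] * hidden_size)
                       for m in modes}

    def act(self, image, mode):
        feats = multi_scale_features(image)
        # Only the active mode's memory is updated here (assumption).
        h, c = self.cells[mode].step(feats, self.states[mode])
        self.states[mode] = (h, c)
        return [sum(w * v for w, v in zip(row, h)) for row in self.heads[mode]]


# Usage: an 8x8 synthetic "image" and three hypothetical behavior modes.
image = [[(r * 8 + c) / 64.0 for c in range(8)] for r in range(8)]
controller = MultiModeController(["left", "forward", "right"], feature_size=20)
action = controller.act(image, "forward")
```

Keeping a separate recurrent state per mode means the history gathered while executing one behavior does not overwrite the history of another. A trained version would replace the pooling with learned convolutions and fit all weights to expert demonstrations.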
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Video Surveillance and Tracking Methods · Advanced Vision and Imaging
Methods: Convolution
