NeoNav: Improving the Generalization of Visual Navigation via Generating Next Expected Observations
Qiaoyun Wu, Dinesh Manocha, Jun Wang, Kai Xu

TL;DR
NeoNav introduces a variational Bayesian generative model that predicts next expected observations to enhance the generalization ability of visual navigation agents across new targets and scenes.
Contribution
The paper presents NeoNav, a novel generative model with a mixture-of-Gaussians prior that improves cross-target and cross-scene generalization in visual navigation tasks.
Findings
Outperforms state-of-the-art methods in navigation success rate
Improves data efficiency in navigation
Generalizes robustly on both real-world and synthetic benchmarks
Abstract
We propose improving the cross-target and cross-scene generalization of visual navigation through learning an agent that is guided by conceiving the next observations it expects to see. This is achieved by learning a variational Bayesian model, called NeoNav, which generates the next expected observations (NEO) conditioned on the current observations of the agent and the target view. Our generative model is learned through optimizing a variational objective encompassing two key designs. First, the latent distribution is conditioned on current observations and the target view, leading to model-based, target-driven navigation. Second, the latent space is modeled with a Mixture of Gaussians conditioned on the current observation and the next best action. Our use of a mixture-of-posteriors prior effectively alleviates the issue of over-regularized latent space, thus significantly boosting…
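The variational objective described in the abstract pairs a posterior over the latent code with a mixture-of-Gaussians prior. The KL divergence from a single Gaussian to a mixture has no closed form, so one common workaround is a Monte Carlo estimate. The sketch below illustrates that generic idea only and is not the authors' implementation: the function names, the diagonal-covariance assumption, the `beta` weight, and the scalar `recon_error` are all hypothetical choices made for illustration.

```python
import math
import random


def log_normal_diag(z, mean, std):
    """Log density of a diagonal Gaussian evaluated at the point z."""
    return sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(s) - 0.5 * ((zi - m) / s) ** 2
        for zi, m, s in zip(z, mean, std)
    )


def log_mixture(z, weights, means, stds):
    """Log density of a mixture of diagonal Gaussians (log-sum-exp for stability)."""
    logs = [
        math.log(w) + log_normal_diag(z, m, s)
        for w, m, s in zip(weights, means, stds)
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(value - top) for value in logs))


def kl_to_mixture(q_mean, q_std, weights, means, stds, n_samples=2000, seed=0):
    """Monte Carlo estimate of KL(q || p).

    q is a diagonal Gaussian (standing in for the posterior) and p is a
    mixture of diagonal Gaussians (standing in for the prior). Samples are
    drawn from q, and log q(z) - log p(z) is averaged over them.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        z = [rng.gauss(m, s) for m, s in zip(q_mean, q_std)]
        total += log_normal_diag(z, q_mean, q_std) - log_mixture(
            z, weights, means, stds
        )
    return total / n_samples


def negative_elbo(recon_error, kl, beta=1.0):
    """Loss to minimize: reconstruction error plus a weighted KL term.

    recon_error is assumed to be a precomputed scalar, for example the error
    between a generated and an observed next view. beta is a hypothetical
    weighting knob, not a parameter named in the source.
    """
    return recon_error + beta * kl


if __name__ == "__main__":
    # Two-component prior in a 2-D latent space (made-up numbers).
    kl = kl_to_mixture(
        q_mean=[0.0, 0.0],
        q_std=[1.0, 1.0],
        weights=[0.5, 0.5],
        means=[[-1.0, 0.0], [1.0, 0.0]],
        stds=[[1.0, 1.0], [1.0, 1.0]],
    )
    print("estimated KL:", kl)
    print("loss:", negative_elbo(recon_error=2.0, kl=kl))
```

In a trained model the posterior and prior parameters would come from networks conditioned on the inputs the abstract lists (current observations and target view for the posterior; current observation and next best action for the prior). Here they are passed in as plain lists to keep the sketch self-contained.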
Taxonomy
Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning
