Embodiment-Agnostic Navigation Policy Trained with Visual Demonstrations
Nimrod Curtis, Osher Azulay, Avishai Sintov

TL;DR
This paper introduces ViDEN, a novel embodiment-agnostic navigation framework trained with visual demonstrations, enabling robots to navigate efficiently in diverse environments with minimal data and improved adaptability.
Contribution
ViDEN is the first framework to leverage depth images and relative target positions for embodiment-agnostic navigation using diffusion-based policies trained on visual demonstrations.
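The summary does not say how the relative target position is obtained. As a minimal sketch, assuming a planar robot pose (x, y, yaw) and a target expressed in the same world frame, the target can be re-expressed in the robot's local frame so the policy input no longer depends on where the robot is in the world. The function name `relative_target` and the frame convention (x forward, y left) are illustrative assumptions, not taken from the paper.

```python
import math


def relative_target(robot_xy, robot_yaw, target_xy):
    """Express a world-frame target in the robot's local frame.

    Hypothetical helper for illustration only; assumes a 2D pose with
    yaw in radians and a local frame with x forward and y to the left.
    """
    # Offset from robot to target, still in world coordinates.
    dx = target_xy[0] - robot_xy[0]
    dy = target_xy[1] - robot_xy[1]

    # Rotate the offset by -yaw to move it into the robot's frame.
    cos_y, sin_y = math.cos(robot_yaw), math.sin(robot_yaw)
    local_x = cos_y * dx + sin_y * dy
    local_y = -sin_y * dx + cos_y * dy
    return local_x, local_y


if __name__ == "__main__":
    # Robot at (1, 1) facing +y in the world; target 2 m further along +y.
    print(relative_target((1.0, 1.0), math.pi / 2, (1.0, 3.0)))
```

Because the result depends only on the offset between robot and target, the same input representation can be shared across platforms with different bodies, which is in the spirit of the embodiment-agnostic design described above.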
Findings
Outperforms existing methods in navigation tasks
Requires less data for training
Works effectively in indoor and outdoor scenarios
Abstract
Learning to navigate in unstructured environments is a challenging task for robots. While reinforcement learning can be effective, it often requires extensive data collection and can pose risk. Learning from expert demonstrations, on the other hand, offers a more efficient approach. However, many existing methods rely on specific robot embodiments and pre-specified target images, and require large datasets. We propose the Visual Demonstration-based Embodiment-agnostic Navigation (ViDEN) framework, which leverages visual demonstrations to train embodiment-agnostic navigation policies. ViDEN utilizes depth images to reduce input dimensionality and relies on relative target positions, making it more adaptable to diverse environments. By training a diffusion-based policy on task-centric and embodiment-agnostic demonstrations, ViDEN can generate collision-free and adaptive…
Taxonomy
Topics: Multimodal Machine Learning Applications
