Mapping Navigation Instructions to Continuous Control Actions with Position-Visitation Prediction
Valts Blukis, Dipendra Misra, Ross A. Knepper, Yoav Artzi

TL;DR
This paper introduces a model that translates natural language instructions into continuous drone control actions by predicting position-visitation distributions, leading to improved task accuracy in simulation.
Contribution
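The paper presents a two-step model that separates interpretable position-visitation prediction from action selection. This decomposition allows simple and efficient training with a combination of supervised learning and imitation learning, and the model outperforms two state-of-the-art instruction-following methods.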
Findings
Improved absolute task-completion accuracy by 16.85% over two state-of-the-art instruction-following methods.
Demonstrated mapping of natural language instructions and raw observations to continuous control actions.
Validated the approach in a realistic drone simulator.
Abstract
We propose an approach for mapping natural language instructions and raw observations to continuous control of a quadcopter drone. Our model predicts interpretable position-visitation distributions indicating where the agent should go during execution and where it should stop, and uses the predicted distributions to select the actions to execute. This two-step model decomposition allows for simple and efficient training using a combination of supervised learning and imitation learning. We evaluate our approach with a realistic drone simulator, and demonstrate absolute task-completion accuracy improvements of 16.85% over two state-of-the-art instruction-following methods.
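The two-step decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the grid of raw scores, the function names (`softmax_2d`, `argmax_2d`, `select_action`), the (forward velocity, yaw rate, stop) action format, and all parameter values are assumptions made for illustration. In particular, step 2 here is a hand-written proportional rule standing in for the learned action-selection stage the paper trains with imitation learning.

```python
import math

def softmax_2d(scores):
    """Step 1 (illustrative): normalize a 2D grid of raw scores into a
    probability distribution over positions."""
    m = max(s for row in scores for s in row)  # subtract max for stability
    exps = [[math.exp(s - m) for s in row] for row in scores]
    total = sum(e for row in exps for e in row)
    return [[e / total for e in row] for row in exps]

def argmax_2d(dist):
    """Return the (x, y) grid cell with the highest probability."""
    cells = [(x, y) for x in range(len(dist)) for y in range(len(dist[0]))]
    return max(cells, key=lambda c: dist[c[0]][c[1]])

def select_action(agent_pos, agent_heading, goal_cell,
                  stop_radius=0.5, max_speed=1.0, yaw_gain=1.0):
    """Step 2 (hypothetical stand-in): steer toward the most likely stop
    position. Returns (forward_velocity, yaw_rate, stop)."""
    dx = goal_cell[0] - agent_pos[0]
    dy = goal_cell[1] - agent_pos[1]
    if math.hypot(dx, dy) <= stop_radius:
        return (0.0, 0.0, True)  # close enough to the goal: stop
    bearing = math.atan2(dy, dx)
    # Wrap the heading error into [-pi, pi].
    err = math.atan2(math.sin(bearing - agent_heading),
                     math.cos(bearing - agent_heading))
    forward = max_speed * max(0.0, math.cos(err))  # slow down when misaligned
    return (forward, yaw_gain * err, False)

# Toy 3x3 score map standing in for a predicted "where to stop" map.
stop_scores = [[0.0, 0.1, 0.0],
               [0.2, 0.5, 0.1],
               [0.3, 4.0, 0.2]]
stop_dist = softmax_2d(stop_scores)
goal = argmax_2d(stop_dist)
action = select_action(agent_pos=(0.0, 1.0), agent_heading=0.0, goal_cell=goal)
```

Keeping the predicted distribution as an explicit intermediate is what makes the first stage inspectable: the map can be visualized and checked before any action is executed.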
Taxonomy
Topics: Multimodal Machine Learning Applications · Robotic Path Planning Algorithms · Advanced Neural Network Applications
