Just Ask: An Interactive Learning Framework for Vision and Language Navigation

Ta-Chung Chi; Mihail Eric; Seokhwan Kim; Minmin Shen; Dilek Hakkani-Tur

arXiv:1912.00915 · cs.AI · December 3, 2019 · 6 cites

Open Access

TL;DR

This paper introduces an interactive learning framework for vision and language navigation, enabling agents to ask questions when uncertain, significantly improving success rates and adapting to noisy responses through reinforcement learning and continual learning strategies.

Contribution

The paper presents an interactive learning framework, combining reinforcement learning and continual learning, that lets vision-and-language navigation agents ask for help when they are uncertain.

Findings

01 Success rate increased by at least 15% with minimal questions

02 Reinforcement learning enables dynamic interaction timing

03 Continual learning improves data efficiency and robustness

Abstract

In the vision and language navigation task, the agent may encounter ambiguous situations that are hard to interpret by just relying on visual information and natural language instructions. We propose an interactive learning framework to endow the agent with the ability to ask for users' help in such situations. As part of this framework, we investigate multiple learning approaches for the agent with different levels of complexity. The simplest model-confusion-based method lets the agent ask questions based on its confusion, relying on the predefined confidence threshold of a next action prediction model. To build on this confusion-based method, the agent is expected to demonstrate more sophisticated reasoning such that it discovers the timing and locations to interact with a human. We achieve this goal using reinforcement learning (RL) with a proposed reward shaping term, which enables…
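The confusion-based method described in the abstract can be illustrated with a minimal sketch: the agent asks for help whenever the top probability of its next-action prediction falls below a predefined threshold. The function names (`softmax`, `should_ask`), the use of a softmax over action logits, and the default threshold of 0.5 are assumptions made for illustration; the abstract does not give the paper's exact confidence measure or threshold value.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw action scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def should_ask(action_logits: Sequence[float], threshold: float = 0.5) -> bool:
    """Return True when the agent is 'confused' and should ask for help.

    Hypothetical rule: confidence is the highest next-action probability,
    and the agent asks whenever it drops below `threshold`.
    """
    confidence = max(softmax(action_logits))
    return confidence < threshold


# One action clearly dominates, so the agent proceeds on its own.
print(should_ask([2.0, 0.1, 0.1]))
# All actions look equally plausible, so the agent asks.
print(should_ask([1.0, 1.0, 1.0]))
```

A fixed threshold like this is the simplest policy for deciding when to ask; according to the abstract, the reinforcement learning approach is meant to go beyond it by learning the timing and locations of interaction.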

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Reinforcement Learning in Robotics