Neural Modular Control for Embodied Question Answering

Abhishek Das; Georgia Gkioxari; Stefan Lee; Devi Parikh; Dhruv Batra

arXiv:1810.11181·cs.AI·May 6, 2019·19 cites

TL;DR

This paper introduces a hierarchical, modular policy framework for embodied question answering that combines imitation and reinforcement learning, significantly improving navigation and answering accuracy in complex indoor environments.

Contribution

The paper proposes a novel hierarchical policy architecture with semantic subgoals, enhancing sample efficiency and adaptability for embodied question answering tasks.

Findings

01 Outperforms prior methods on the EQA benchmark

02 Improves navigation accuracy in realistic indoor environments

03 Enhances question answering performance

Abstract

We present a modular approach for learning policies for navigation over long planning horizons from language input. Our hierarchical policy operates at multiple timescales, where the higher-level master policy proposes subgoals to be executed by specialized sub-policies. Our choice of subgoals is compositional and semantic, i.e. they can be sequentially combined in arbitrary orderings, and assume human-interpretable descriptions (e.g. 'exit room', 'find kitchen', 'find refrigerator', etc.). We use imitation learning to warm-start policies at each level of the hierarchy, dramatically increasing sample efficiency, followed by reinforcement learning. Independent reinforcement learning at each level of hierarchy enables sub-policies to adapt to consequences of their actions and recover from errors. Subsequent joint hierarchical training enables the master policy to adapt to the…
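The two-timescale control loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `ToyEnv`, `make_sub_policy`, `scripted_master`, `run_episode`, and the step budgets are hypothetical names and values chosen for illustration, and the policies here are scripted stand-ins rather than learned networks. Only the subgoal strings come from the abstract's examples.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Subgoal names taken from the examples in the abstract.
SUBGOALS = ["exit room", "find kitchen", "find refrigerator"]

# A sub-policy maps "steps taken under this subgoal" to (action, done).
SubPolicy = Callable[[int], Tuple[str, bool]]


@dataclass
class ToyEnv:
    """Hypothetical stand-in environment that only counts primitive steps."""
    steps: int = 0

    def step(self, action: str) -> int:
        self.steps += 1
        return self.steps


def make_sub_policy(budget: int) -> SubPolicy:
    """Toy sub-policy: emit 'forward' `budget` times, then return control."""
    def policy(t: int) -> Tuple[str, bool]:
        return "forward", t + 1 >= budget
    return policy


def scripted_master(k: int) -> Optional[str]:
    """Toy master policy: propose subgoals in a fixed order, then stop."""
    return SUBGOALS[k] if k < len(SUBGOALS) else None


def run_episode(
    env: ToyEnv,
    master: Callable[[int], Optional[str]],
    sub_policies: Dict[str, SubPolicy],
    max_subgoals: int = 10,
) -> List[Tuple[str, int]]:
    """Outer loop runs at the subgoal timescale, inner loop at the action timescale."""
    trace: List[Tuple[str, int]] = []
    for k in range(max_subgoals):
        subgoal = master(k)
        if subgoal is None:  # master decides the episode is over
            break
        policy = sub_policies[subgoal]
        t, done = 0, False
        while not done:  # sub-policy acts until it hands control back
            action, done = policy(t)
            env.step(action)
            t += 1
        trace.append((subgoal, t))
    return trace


# Assumed, arbitrary step budgets per subgoal.
budgets = {"exit room": 2, "find kitchen": 3, "find refrigerator": 1}
sub_policies = {name: make_sub_policy(n) for name, n in budgets.items()}

env = ToyEnv()
trace = run_episode(env, scripted_master, sub_policies)
print(trace)
```

In a learned version of this loop, the master and each sub-policy would be trained models conditioned on observations and the question. The sketch shows only how control passes between the two levels.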

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling