Learning to Compose Neural Networks for Question Answering
Jacob Andreas, Marcus Rohrbach, Trevor Darrell, Dan Klein

TL;DR
This paper introduces a dynamic neural network model that automatically composes neural modules for question answering across images and knowledge bases, learning from minimal supervision to achieve state-of-the-art results.
Contribution
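The paper presents a method for automatically assembling question-specific neural networks from a collection of composable modules. Module parameters and network-assembly parameters are learned jointly via reinforcement learning, and the method applies to both images and structured knowledge bases.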
Findings
Achieved state-of-the-art results on benchmark datasets in both visual and structured domains.
Demonstrated effective joint learning of module parameters and network assembly.
Applicable to both image-based and structured knowledge base question answering.
Abstract
We describe a question answering model that applies to both images and structured knowledge bases. The model uses natural language strings to automatically assemble neural networks from a collection of composable modules. Parameters for these modules are learned jointly with network-assembly parameters via reinforcement learning, with only (world, question, answer) triples as supervision. Our approach, which we term a dynamic neural module network, achieves state-of-the-art results on benchmark datasets in both visual and structured domains.
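To make the idea of assembling a network from modules and learning the assembly by reinforcement more concrete, here is a minimal toy sketch in Python. It is not the authors' implementation. Everything in it is an assumption made for illustration: the module names (`find`, `and_`, `exists`), the nested-tuple layout encoding, the toy `WORLD`, the 0/1 reward, and the helper `train_layout_policy`. The modules here are fixed set operations rather than neural networks with learned parameters, so only the layout-selection scores are trained, using a plain REINFORCE update over a fixed list of candidate layouts.

```python
import math
import random

# Toy "world": entities with attributes (hypothetical data, not from the paper).
WORLD = [
    {"name": "a", "color": "red", "shape": "circle"},
    {"name": "b", "color": "blue", "shape": "square"},
    {"name": "c", "color": "red", "shape": "square"},
]

# Composable modules (assumed names). Attention-like modules return sets of
# entity indices; `exists` maps such a set to an answer string.
def find(world, key, value):
    return {i for i, e in enumerate(world) if e[key] == value}

def and_(world, left, right):
    return left & right

def exists(world, attended):
    return "yes" if attended else "no"

def execute(layout, world):
    """Recursively assemble and run the network described by a nested tuple."""
    op, *args = layout
    if op == "find":
        return find(world, *args)
    if op == "and":
        return and_(world, *(execute(a, world) for a in args))
    if op == "exists":
        return exists(world, execute(args[0], world))
    raise ValueError(f"unknown module: {op}")

# Candidate layouts for the question "Is there a red square?" (hypothetical).
CANDIDATES = [
    ("exists", ("and", ("find", "color", "red"), ("find", "shape", "square"))),
    ("exists", ("and", ("find", "color", "blue"), ("find", "shape", "circle"))),
    ("exists", ("find", "color", "green")),
]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_layout_policy(candidates, world, answer, steps=200, lr=0.5, seed=0):
    """Learn a distribution over layouts from a (world, question, answer) triple."""
    rng = random.Random(seed)
    scores = [0.0] * len(candidates)  # one learnable score per candidate layout
    for _ in range(steps):
        probs = softmax(scores)
        k = rng.choices(range(len(candidates)), weights=probs)[0]
        # Assumed reward: 1 if the sampled layout yields the right answer, else 0.
        reward = 1.0 if execute(candidates[k], world) == answer else 0.0
        # REINFORCE: d/ds_j log p(k) = 1[j == k] - p_j
        for j in range(len(scores)):
            scores[j] += lr * reward * ((1.0 if j == k else 0.0) - probs[j])
    return softmax(scores)
```

Calling `train_layout_policy(CANDIDATES, WORLD, "yes")` returns a probability for each candidate layout. In this toy setup only the first layout ever earns a reward, so the policy shifts probability mass toward it without being told which layout is correct. That mirrors, in a very reduced form, learning network assembly from answer supervision alone.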
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling
