Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction
Hyeonwoo Noh, Paul Hongsuck Seo, Bohyung Han

TL;DR
This paper introduces a CNN-based image question answering (ImageQA) model with a dynamic parameter layer whose weights are predicted from the question by a GRU-based network. A hashing technique keeps the number of predicted parameters manageable, and the model achieves state-of-the-art results on public ImageQA benchmarks.
Contribution
It proposes a dynamic parameter prediction mechanism for CNNs in ImageQA, uses hashing to handle the large parameter space of the fully-connected dynamic parameter layer efficiently, and reports state-of-the-art performance on public ImageQA benchmarks.
Findings
Achieves state-of-the-art results on public ImageQA benchmarks.
Uses a hashing technique to reduce the number of weights the parameter prediction network must generate for the dynamic parameter layer.
End-to-end training of the joint network enhances performance.
Abstract
We tackle the image question answering (ImageQA) problem by learning a convolutional neural network (CNN) with a dynamic parameter layer whose weights are determined adaptively based on questions. For the adaptive parameter prediction, we employ a separate parameter prediction network, which consists of a gated recurrent unit (GRU) taking a question as its input and a fully-connected layer generating a set of candidate weights as its output. However, it is challenging to construct a parameter prediction network for a large number of parameters in the fully-connected dynamic parameter layer of the CNN. We reduce the complexity of this problem by incorporating a hashing technique, where the candidate weights given by the parameter prediction network are selected using a predefined hash function to determine individual weights in the dynamic parameter layer. The proposed network---joint network…
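The hashing step described in the abstract can be illustrated with a small sketch. It assumes a random vector in place of the GRU question embedding, a single linear map as the fully-connected prediction layer, and a made-up integer hash; the names `hash_index` and `build_dynamic_weights`, the layer sizes, and the hash constants are all hypothetical and are not taken from the paper. The point is only to show how a small set of candidate weights can fill a much larger weight matrix through a fixed hash.

```python
import numpy as np


def hash_index(i, j, num_candidates, seed=0):
    """Map weight position (i, j) to a candidate index.

    Illustrative hash only (an assumption); the paper's predefined
    hash function is not reproduced here.
    """
    h = (i * 73856093) ^ (j * 19349663) ^ (seed * 83492791)
    return h % num_candidates


def build_dynamic_weights(candidates, out_dim, in_dim, seed=0):
    """Fill an (out_dim, in_dim) matrix by looking up shared candidates."""
    rows, cols = np.indices((out_dim, in_dim), dtype=np.int64)
    idx = hash_index(rows, cols, candidates.shape[0], seed)
    return candidates[idx]  # integer-array indexing, result has shape of idx


rng = np.random.default_rng(0)

# Stand-in for the GRU's question embedding (assumption: 8-dim).
question_vec = rng.standard_normal(8)

# Fully-connected layer that emits 16 candidate weights from the question.
W_pred = rng.standard_normal((16, 8)) * 0.1
candidates = W_pred @ question_vec  # shape (16,)

# Dynamic parameter layer: 4 x 32 = 128 weights drawn from 16 candidates.
W_dyn = build_dynamic_weights(candidates, out_dim=4, in_dim=32)

# Apply the question-dependent layer to a stand-in image feature vector.
image_feat = rng.standard_normal(32)
out = W_dyn @ image_feat
```

In this sketch the prediction layer only has to produce 16 values, yet the dynamic layer has 128 weights, each tied to one candidate by its position. Because the hash is fixed, the same positions share the same candidate for every question, while the candidate values themselves change with the question.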
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
Methods: Gated Recurrent Unit
