Deep Exemplar Networks for VQA and VQG
Badri N. Patro, Vinay P. Namboodiri

TL;DR
This paper adds exemplar-based modules to deep learning architectures for VQA and VQG and reports improved performance and generalization, motivated by the observation that humans often rely on exemplars.
Contribution
It proposes an exemplar-based module that can be integrated into existing deep learning models for VQA and VQG to improve their performance.
Findings
Exemplar modules improve performance on both VQA and VQG.
The approach generalizes across multiple architectures.
Models with the exemplar module outperform their baselines in the reported experiments.
Abstract
In this paper, we consider the problem of solving semantic tasks such as "Visual Question Answering" (VQA), where one aims to answer questions related to an image, and "Visual Question Generation" (VQG), where one aims to generate a natural question pertaining to an image. Solutions for VQA and VQG tasks have been proposed using variants of encoder-decoder deep learning frameworks that have shown impressive performance. Humans, however, often show generalization by relying on exemplar-based approaches. For instance, the work by Tversky and Kahneman suggests that humans use exemplars when making categorizations and decisions. In this work, we propose incorporating exemplar-based approaches to solve these problems. Specifically, we show that an exemplar-based module can be incorporated in almost any of the deep learning architectures…
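The abstract does not specify how the exemplar module works, so the following is only a generic illustration of the idea of relying on exemplars, not the authors' method. It is a minimal sketch that assumes each image-question pair has already been encoded as a fixed-length embedding, retrieves the most similar stored embeddings by cosine similarity, and adds their attention-weighted average to the query embedding. The names `retrieve_exemplars` and `fuse_with_exemplars`, the toy `bank`, and the additive fusion are all hypothetical choices made for this example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_exemplars(query, bank, k=2):
    """Return the k embeddings in `bank` most similar to `query` (hypothetical helper)."""
    ranked = sorted(bank, key=lambda e: cosine(query, e), reverse=True)
    return ranked[:k]

def fuse_with_exemplars(query, exemplars):
    """Add a softmax-weighted average of the exemplars to the query embedding.

    Additive fusion is an assumption made for this sketch only.
    """
    if not exemplars:
        return list(query)
    scores = [cosine(query, e) for e in exemplars]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    context = [
        sum(w * e[i] for w, e in zip(weights, exemplars))
        for i in range(len(query))
    ]
    return [q + c for q, c in zip(query, context)]

# Toy embedding bank standing in for encoded training examples.
bank = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
query = [1.0, 0.05]
exemplars = retrieve_exemplars(query, bank, k=2)
fused = fuse_with_exemplars(query, exemplars)
```

In a full model, the fused vector would feed the answer classifier (VQA) or question decoder (VQG) in place of the plain embedding, which is one way a module like this could be attached to an existing encoder-decoder architecture.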
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
