Deep Exemplar Networks for VQA and VQG

Badri N. Patro; Vinay P. Namboodiri

arXiv:1912.09551 · cs.CV · December 23, 2019 · 1 citation

TL;DR

This paper adds exemplar-based modules to deep learning architectures for VQA and VQG and demonstrates improved performance and generalization, motivated by the way humans rely on exemplars.

Contribution

It proposes a novel exemplar-based approach that can be integrated into existing deep learning models for VQA and VQG, enhancing their effectiveness.

Findings

01 Exemplar modules improve VQA and VQG accuracy.

02 The approach generalizes across multiple architectures.

03 Empirical results outperform baseline models.

Abstract

In this paper, we consider the problem of solving semantic tasks such as "Visual Question Answering" (VQA), where one aims to answer questions related to an image, and "Visual Question Generation" (VQG), where one aims to generate a natural question pertaining to an image. Solutions for VQA and VQG tasks have been proposed using variants of encoder-decoder deep learning-based frameworks that have shown impressive performance. Humans, however, often show generalization by relying on exemplar-based approaches. For instance, the work by Tversky and Kahneman suggests that humans use exemplars when making categorizations and decisions. In this work, we propose the incorporation of exemplar-based approaches towards solving these problems. Specifically, we incorporate exemplar-based approaches and show that an exemplar-based module can be incorporated in almost any of the deep learning architectures…
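The abstract is truncated before it describes how the exemplar module works, so the following is only a generic illustration of the idea of leaning on similar stored examples, not the authors' architecture. It assumes inputs have already been encoded as fixed-length embeddings, retrieves the nearest stored exemplars by cosine similarity, and blends their softmax-weighted average into the query embedding. The names `cosine`, `nearest_exemplars`, `fuse_with_exemplars`, and the parameters `k` and `mix` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_exemplars(query, bank, k=2):
    """Return the k (similarity, embedding) pairs in bank most similar to query."""
    scored = [(cosine(query, e), e) for e in bank]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def fuse_with_exemplars(query, bank, k=2, mix=0.5):
    """Blend query with a softmax-weighted average of its k nearest exemplars.

    Assumption: a simple convex mix stands in for whatever fusion
    a real model would learn.
    """
    top = nearest_exemplars(query, bank, k)
    weights = [math.exp(s) for s, _ in top]
    total = sum(weights)
    context = [
        sum(w * e[i] for w, (_, e) in zip(weights, top)) / total
        for i in range(len(query))
    ]
    return [(1 - mix) * q + mix * c for q, c in zip(query, context)]


# Toy usage: a two-dimensional embedding bank with three stored exemplars.
bank = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
query = [1.0, 0.0]
fused = fuse_with_exemplars(query, bank, k=2)
```

In a trained model the embeddings would come from the image and question encoders and the fusion would be learned; the fixed `mix` weight here only keeps the sketch self-contained.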

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning