IQ-VQA: Intelligent Visual Question Answering

Vatsal Goel; Mohit Chandak; Ashish Anand; Prithwijit Guha

arXiv:2007.04422·cs.CV·July 10, 2020

TL;DR

This paper introduces a cyclic framework for Visual Question Answering that enhances model consistency and robustness by training models to answer original questions, generate implications, and answer those implications, supported by a new dataset.

Contribution

A novel cyclic framework for VQA that improves consistency and robustness, along with a new annotated dataset of implications for evaluation.

Findings

01

Improves the consistency of VQA models by ~15% on the rule-based dataset

02

Improves robustness by ~2% without degrading performance

03

Produces higher-quality attention maps, suggesting improved multi-modal understanding

Abstract

Even though there has been tremendous progress in the field of Visual Question Answering, models today still tend to be inconsistent and brittle. To address this, we propose a model-independent cyclic framework which increases the consistency and robustness of any VQA architecture. We train our models to answer the original question, generate an implication based on the answer, and then learn to answer the generated implication correctly. As part of the cyclic framework, we propose a novel implication generator which can generate implied questions from any question-answer pair. As a baseline for future work on consistency, we provide a new human-annotated VQA-Implications dataset. The dataset consists of ~30k questions containing implications of three types (Logical Equivalence, Necessary Condition, and Mutual Exclusion) made from the VQA v2.0 validation dataset. We show that our framework…
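The answer, generate-implication, answer-implication loop described in the abstract can be illustrated with a toy sketch. Everything below is hypothetical and chosen for illustration only: the `QA` type, the `cyclic_pass` and `consistency` helpers, the lookup-table "model" and "generator", and the bird-counting example are not taken from the paper or its repository. The sketch also assumes the implication is generated from the ground-truth question-answer pair, and that consistency is measured as the share of implications answered correctly among originals answered correctly; the paper's learned generator and exact metric may differ.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# The three implication types named in the abstract.
IMPLICATION_TYPES = ("logical_equivalence", "necessary_condition", "mutual_exclusion")


@dataclass(frozen=True)
class QA:
    question: str
    answer: str


# Hypothetical signatures (assumptions, not the paper's API):
# a VQA model maps (image, question) to an answer string, and an
# implication generator maps (QA pair, implication type) to an implied QA pair.
VQAModel = Callable[[str, str], str]
Generator = Callable[[QA, str], QA]


def cyclic_pass(model: VQAModel, generator: Generator, image: str,
                qa: QA, kind: str) -> Tuple[bool, QA, bool]:
    """One answer -> implication -> answer cycle for a single QA pair."""
    original_ok = model(image, qa.question) == qa.answer
    implied = generator(qa, kind)  # assumption: built from the ground-truth pair
    implied_ok = model(image, implied.question) == implied.answer
    return original_ok, implied, implied_ok


def consistency(results: List[Tuple[bool, QA, bool]]) -> float:
    """Share of implications answered correctly, restricted to cycles whose
    original question was answered correctly (an assumed definition)."""
    relevant = [r for r in results if r[0]]
    if not relevant:
        return 0.0
    return sum(1 for r in relevant if r[2]) / len(relevant)


# Toy stand-ins: a lookup-table generator and a deliberately inconsistent model.
original = QA("How many birds are there?", "2")

_IMPLIED: Dict[str, QA] = {
    "logical_equivalence": QA("Are there 2 birds?", "yes"),
    "necessary_condition": QA("Are there any birds?", "yes"),
    "mutual_exclusion": QA("Are there 3 birds?", "no"),
}

_MODEL_ANSWERS: Dict[str, str] = {
    "How many birds are there?": "2",
    "Are there 2 birds?": "yes",
    "Are there any birds?": "yes",
    "Are there 3 birds?": "yes",  # contradicts the model's own answer of "2"
}


def toy_generator(qa: QA, kind: str) -> QA:
    return _IMPLIED[kind]


def toy_model(image: str, question: str) -> str:
    return _MODEL_ANSWERS[question]


results = [cyclic_pass(toy_model, toy_generator, "image_0", original, kind)
           for kind in IMPLICATION_TYPES]
score = consistency(results)
```

In this toy setup the model answers the original question correctly but fails the Mutual Exclusion implication, which is the kind of inconsistency the cyclic training is meant to reduce. In an actual training loop, each stage would contribute a loss term rather than a boolean check.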

Code & Models

Repositories

mchandak29/IQ-VQA (PyTorch)
