SOrT-ing VQA Models: Contrastive Gradient Learning for Improved Consistency

Sameer Dharur; Purva Tendulkar; Dhruv Batra; Devi Parikh; Ramprasaath R. Selvaraju

arXiv:2010.10038 · cs.CV · December 2, 2020

TL;DR

This paper introduces SOrT (Sub-question Oriented Tuning), a contrastive gradient learning method that encourages VQA models to identify and rank the sub-questions relevant to a reasoning question, improving consistency, reasoning accuracy, and visual grounding.

Contribution

The paper proposes SOrT, a contrastive gradient learning approach that improves the consistency of VQA models and their ability to identify which sub-questions are relevant to a reasoning question.

Findings

01. SOrT improves model consistency by up to 6.5 percentage points.
02. SOrT enhances visual grounding accuracy.
03. Gradient-based interpretability helps evaluate sub-question relevance.

Abstract

Recent research in Visual Question Answering (VQA) has revealed state-of-the-art models to be inconsistent in their understanding of the world -- they answer seemingly difficult questions requiring reasoning correctly but get simpler associated sub-questions wrong. These sub-questions pertain to lower level visual concepts in the image that models ideally should understand to be able to answer the higher level question correctly. To address this, we first present a gradient-based interpretability approach to determine the questions most strongly correlated with the reasoning question on an image, and use this to evaluate VQA models on their ability to identify the relevant sub-questions needed to answer a reasoning question. Next, we propose a contrastive gradient learning based approach called Sub-question Oriented Tuning (SOrT) which encourages models to rank relevant sub-questions…
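
The ranking and contrastive ideas described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes each question has already been reduced to a gradient-based importance vector (for example, a flattened Grad-CAM-style map), uses cosine similarity as the relevance score, and uses a margin-based loss; the function names, the margin value, and the toy vectors are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_questions(reasoning_vec, candidate_vecs):
    """Return candidate indices ordered from most to least similar to the reasoning question.

    Assumption: each vector is a gradient-based importance map computed elsewhere.
    """
    scores = [cosine_similarity(reasoning_vec, c) for c in candidate_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def contrastive_margin_loss(reasoning_vec, relevant_vec, irrelevant_vec, margin=0.1):
    """Hinge-style loss that is zero once the relevant sub-question scores at least
    `margin` higher than the irrelevant question (the margin value is hypothetical)."""
    pos = cosine_similarity(reasoning_vec, relevant_vec)
    neg = cosine_similarity(reasoning_vec, irrelevant_vec)
    return max(0.0, margin - pos + neg)


# Toy vectors standing in for gradient maps (made up for illustration).
reasoning = [1.0, 0.0]
candidates = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]
ranking = rank_questions(reasoning, candidates)
loss_when_ranked_well = contrastive_margin_loss(reasoning, [1.0, 0.1], [0.0, 1.0])
loss_when_ranked_badly = contrastive_margin_loss(reasoning, [0.0, 1.0], [1.0, 0.0])
```

In this toy setup the candidate whose vector points in nearly the same direction as the reasoning vector is ranked first, and the loss is nonzero only when the irrelevant question scores too close to, or above, the relevant one. In an actual training setup the vectors would come from the model's gradients, so minimizing such a loss would change what the model attends to.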

Code & Models

Repositories

sameerdharur/sorting-vqa (PyTorch, official)

Taxonomy

Methods, Interpretability