# The Promise of Premise: Harnessing Question Premises in Visual Question Answering

**Authors:** Aroma Mahendru, Viraj Prabhu, Akrit Mohapatra, Dhruv Batra, Stefan Lee

arXiv: 1705.00601 · 2017-08-21

## TL;DR

This paper shows that reasoning about question premises (the objects and relationships a question implies) helps VQA models detect questions that are irrelevant to an image because a premise is false, and that it can improve performance on tasks requiring compositional reasoning.

## Contribution

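It introduces the Question Relevance Prediction and Explanation (QRPE) dataset, built by searching for false premises in visual questions. It also trains question relevance detection models that reason about premises, and shows that forcing standard VQA models to reason about premises during training can improve performance on compositional reasoning tasks.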

## Key findings

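- Question relevance detection models that reason about premises consistently outperform models that do not.
- Searching for false premises (premises not depicted in the image) identifies questions that are irrelevant to an image; this is the basis of the QRPE dataset.
- Forcing standard VQA models to reason about premises during training can improve performance on tasks requiring compositional reasoning.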

## Abstract

In this paper, we make a simple observation that questions about images often contain premises - objects and relationships implied by the question - and that reasoning about premises can help Visual Question Answering (VQA) models respond more intelligently to irrelevant or previously unseen questions. When presented with a question that is irrelevant to an image, state-of-the-art VQA models will still answer purely based on learned language biases, resulting in nonsensical or even misleading answers. We note that a visual question is irrelevant to an image if at least one of its premises is false (i.e., not depicted in the image). We leverage this observation to construct a dataset for Question Relevance Prediction and Explanation (QRPE) by searching for false premises. We train novel question relevance detection models and show that models that reason about premises consistently outperform models that do not. We also find that forcing standard VQA models to reason about premises during training can lead to improvements on tasks requiring compositional reasoning.
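The relevance rule stated in the abstract (a question is irrelevant if at least one of its premises is not depicted in the image) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: premises are represented as plain tuples such as `("dog",)` or `("dog", "wearing", "hat")`, the image is represented as a set of depicted facts in the same form, and the names `false_premises` and `is_relevant` are hypothetical. It is not the authors' premise extraction pipeline or their learned relevance models.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: a premise is a tuple of strings, either an
# object like ("dog",) or a relationship like ("dog", "wearing", "hat").
Premise = Tuple[str, ...]


def false_premises(premises: Iterable[Premise],
                   depicted: Iterable[Premise]) -> List[Premise]:
    """Return the premises that are not among the facts depicted in the image."""
    depicted_set = set(depicted)
    return [p for p in premises if p not in depicted_set]


def is_relevant(premises: Iterable[Premise],
                depicted: Iterable[Premise]) -> bool:
    """A question is relevant only if none of its premises is false."""
    return not false_premises(premises, depicted)


# Illustrative data for "What color is the hat the dog is wearing?"
question_premises = [("dog",), ("dog", "wearing", "hat")]
image_facts = {("dog",), ("grass",)}

print(is_relevant(question_premises, image_facts))
print(false_premises(question_premises, image_facts))
```

In this sketch the list of false premises doubles as a simple explanation of why the question is irrelevant, in the spirit of the "Explanation" part of QRPE. A question with no extracted premises is treated as relevant, which is a design choice of the sketch rather than something the abstract specifies.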

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.00601/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1705.00601/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/1705.00601/full.md

---
Source: https://tomesphere.com/paper/1705.00601