Explicit Knowledge-based Reasoning for Visual Question Answering
Peng Wang, Qi Wu, Chunhua Shen, Anton van den Hengel, Anthony Dick

TL;DR
This paper introduces a knowledge-based reasoning method for visual question answering that handles complex questions, explains its answers, and outperforms LSTM-based approaches. It is accompanied by a new dataset and evaluation protocol.
Contribution
It presents a novel explicit knowledge reasoning approach for visual question answering (VQA) that answers complex questions and explains its answers, and it provides a dataset and protocol for evaluating such methods.
Findings
Outperforms LSTM-based methods significantly
Capable of answering more complex questions
Provides explanations for its answers
Abstract
We describe a method for visual question answering which is capable of reasoning about contents of an image on the basis of information extracted from a large-scale knowledge base. The method not only answers natural language questions using concepts not contained in the image, but can provide an explanation of the reasoning by which it developed its answer. The method is capable of answering far more complex questions than the predominant long short-term memory-based approach, and outperforms it significantly in the testing. We also provide a dataset and a protocol by which to evaluate such methods, thus addressing one of the key issues in general visual question answering.
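The general idea in the abstract, answering a question by combining concepts found in the image with facts from a knowledge base and reporting the facts used, can be illustrated with a toy sketch. This is not the authors' implementation: the knowledge base `TOY_KB`, the function `answer_with_explanation`, and the triple format are all hypothetical, and the "detected" concepts are supplied by hand rather than by a vision model.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy knowledge base of (subject, predicate, object) triples.
# A real system would query a large-scale knowledge base instead.
TOY_KB: List[Triple] = [
    ("cat", "is_a", "mammal"),
    ("dog", "is_a", "mammal"),
    ("umbrella", "used_for", "rain protection"),
]


def answer_with_explanation(
    detected: Set[str], predicate: str, target: str, kb: List[Triple]
) -> Tuple[List[str], List[str]]:
    """Find detected concepts c such that (c, predicate, target) is in kb.

    Returns the matching concepts and one explanation string per match,
    naming the knowledge-base fact that supports the answer.
    """
    answers: List[str] = []
    explanation: List[str] = []
    for concept in sorted(detected):  # sorted for deterministic output
        if (concept, predicate, target) in kb:
            answers.append(concept)
            explanation.append(
                f"'{concept}' was detected in the image and the knowledge "
                f"base states ({concept}, {predicate}, {target})"
            )
    return answers, explanation


# Toy question: "Which objects in the image are mammals?"
# The concept "mammal" never appears in the image itself; it comes from the KB.
answers, explanation = answer_with_explanation(
    {"cat", "umbrella", "table"}, "is_a", "mammal", TOY_KB
)
print(answers)
for line in explanation:
    print(line)
```

The explanation here is simply the list of supporting triples, which is one plausible reading of "an explanation of the reasoning"; the paper itself should be consulted for how questions are parsed and how the knowledge base is actually queried.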
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
