Logically Consistent Loss for Visual Question Answering
Anh-Cat Le-Ngo, Truyen Tran, Santu Rana, Sunil Gupta, Svetha Venkatesh

TL;DR
This paper introduces a logic-based loss function and data organization techniques to improve the logical consistency and performance of neural network models in visual question answering tasks.
Contribution
It proposes a model-agnostic logically consistent loss and data organization methods that enhance answer consistency and accuracy in VQA systems.
Findings
Increased answer consistency in VQA models.
Improved performance with the proposed loss and data organization.
Model-agnostic by design, so applicable to QA models beyond the MAC-net used in the experiments.
Abstract
Given an image, background knowledge, and a set of questions about an object, human learners answer the questions very consistently regardless of question forms and semantic tasks. Current neural-network-based Visual Question Answering (VQA) models, despite their impressive performance, cannot ensure such consistency because of the independent and identically distributed (i.i.d.) assumption. We propose a new model-agnostic logic constraint to tackle this issue by formulating a logically consistent loss in the multi-task learning framework, as well as data organisation schemes called family-batch and hybrid-batch. To demonstrate the usefulness of this proposal, we train and evaluate MAC-net based VQA machines with and without the proposed logically consistent loss and the proposed data organisation. The experiments confirm that the proposed loss formulae and the introduction of hybrid-batch lead to more…
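The abstract does not give the loss formula or define the batch schemes, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that a "family" means all questions about the same image, and it uses a simple hinge-style penalty on implication pairs as a stand-in for a logic-consistency term. The function names `family_batches`, `implication_penalty`, and `consistency_loss` are hypothetical.

```python
from collections import defaultdict


def family_batches(samples):
    """Group (image_id, question, answer) triples by image_id.

    Assumption: a "family" is the set of questions about one image.
    The paper's actual definition may differ.
    """
    families = defaultdict(list)
    for image_id, question, answer in samples:
        families[image_id].append((question, answer))
    return dict(families)


def implication_penalty(p_premise, p_conclusion):
    """Penalty for violating "premise implies conclusion".

    If the model believes the premise more strongly than the
    conclusion, the gap is penalised; otherwise the penalty is zero.
    """
    return max(0.0, p_premise - p_conclusion)


def consistency_loss(probs, implications):
    """Average implication penalty over a family of questions.

    probs: dict mapping question id -> predicted probability of "yes".
    implications: list of (premise_id, conclusion_id) pairs.
    """
    if not implications:
        return 0.0
    total = sum(implication_penalty(probs[a], probs[b]) for a, b in implications)
    return total / len(implications)


if __name__ == "__main__":
    samples = [
        ("img1", "Is there a red cube?", "yes"),
        ("img1", "Is there a cube?", "yes"),
        ("img2", "Is there a sphere?", "no"),
    ]
    print(family_batches(samples))
    # "red cube exists" implies "cube exists"; this prediction violates it.
    print(consistency_loss({"red_cube": 0.9, "cube": 0.4}, [("red_cube", "cube")]))
```

In a multi-task setup, a term like this would be added to the usual answer-classification loss, and grouping a family into one batch is what makes the related predictions available together when the penalty is computed.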
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling
