Supervising the Transfer of Reasoning Patterns in VQA
Corentin Kervadec, Christian Wolf, Grigory Antipov, Moez Baccouche, and Madiha Nadri

TL;DR
This paper introduces a regularization-based method for transferring reasoning patterns in VQA models, improving their reasoning capabilities and generalization by supervising reasoning operations during training.
Contribution
It proposes a novel regularization technique for knowledge transfer of reasoning patterns, supported by PAC-learning theory and by experimental validation on the GQA dataset.
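As a rough illustration of what a regularization term that supervises reasoning operations could look like, the sketch below adds a weighted program-prediction loss to the usual answer loss. It is a minimal sketch, not the authors' implementation: the function names, the per-step cross-entropy form, the averaging over steps, and the `weight` parameter are all assumptions made for illustration, since this summary does not give the exact form of the loss.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of class `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def vqa_loss_with_program_supervision(answer_logits, answer,
                                      op_logits_seq, op_seq, weight=1.0):
    """Answer loss plus a weighted regularizer on predicted operations.

    Hypothetical names and loss form:
      answer_logits -- scores over candidate answers
      answer        -- index of the ground-truth answer
      op_logits_seq -- one list of scores over operations per reasoning step
      op_seq        -- ground-truth operation index for each step
      weight        -- strength of the regularization term (assumed scalar)
    """
    if len(op_logits_seq) != len(op_seq):
        raise ValueError("need one ground-truth operation per predicted step")
    answer_loss = cross_entropy(answer_logits, answer)
    # Average the per-step operation losses so program length does not
    # change the scale of the regularizer.
    program_loss = sum(
        cross_entropy(logits, op) for logits, op in zip(op_logits_seq, op_seq)
    ) / max(len(op_seq), 1)
    return answer_loss + weight * program_loss
```

Setting `weight=0.0` recovers the plain answer loss, so the regularizer can be switched off to compare against an unsupervised baseline.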
Findings
Improved reasoning pattern transfer in VQA models.
Reduced sample complexity for program prediction.
Complementary effects with BERT-like pre-training.
Abstract
Methods for Visual Question Answering (VQA) are notorious for leveraging dataset biases rather than performing reasoning, hindering generalization. It has been recently shown that better reasoning patterns emerge in attention layers of a state-of-the-art VQA model when it is trained on perfect (oracle) visual inputs. This provides evidence that deep neural networks can learn to reason when training conditions are favorable enough. However, transferring this learned knowledge to deployable models is a challenge, as much of it is lost during the transfer. We propose a method for knowledge transfer based on a regularization term in our loss function, supervising the sequence of required reasoning operations. We provide a theoretical analysis based on PAC-learning, showing that such program prediction can lead to decreased sample complexity under mild hypotheses. We also demonstrate the…
Taxonomy
Topics: AI-based Problem Solving and Planning · Multimodal Machine Learning Applications · Topic Modeling
