Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
Kexin Yi, Jiajun Wu, Chuang Gan, Antonio Torralba, Pushmeet Kohli,, Joshua B. Tenenbaum

TL;DR
This paper introduces a neural-symbolic approach to visual question answering that combines deep learning with symbolic reasoning, achieving high accuracy, efficiency, and interpretability on complex reasoning tasks.
Contribution
It presents a novel neural-symbolic VQA system that integrates scene understanding with symbolic program execution, enhancing robustness, efficiency, and transparency.
Findings
Achieves 99.8% accuracy on CLEVR dataset
Requires less training data and storage
Provides interpretable reasoning steps
Abstract
We marry two powerful ideas: deep representation learning for visual recognition and language understanding, and symbolic program execution for reasoning. Our neural-symbolic visual question answering (NS-VQA) system first recovers a structural scene representation from the image and a program trace from the question. It then executes the program on the scene representation to obtain an answer. Incorporating symbolic structure as prior knowledge offers three unique advantages. First, executing programs on a symbolic space is more robust to long program traces; our model can solve complex reasoning tasks better, achieving an accuracy of 99.8% on the CLEVR dataset. Second, the model is more data- and memory-efficient: it performs well after learning on a small number of training data; it can also encode an image into a compact representation, requiring less storage than existing methods…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMultimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition
