A Benchmark for Compositional Visual Reasoning
Aimen Zerroug, Mohit Vaishnav, Julien Colin, Sebastian Musslick, Thomas Serre

TL;DR
This paper introduces a new benchmark, CVR, to evaluate and improve the data efficiency of AI systems in compositional visual reasoning, inspired by human learning and reasoning tests.
Contribution
The paper presents a novel benchmark for compositional visual reasoning, including a scalable method for creating abstract rule compositions and datasets, to assess and enhance AI sample efficiency.
Findings
Convolutional architectures outperform transformers in most data regimes.
All models are less data-efficient than humans, even with self-supervised learning.
The benchmark measures generalization, transfer, and compositionality in visual reasoning.
Abstract
A fundamental component of human vision is our ability to parse complex visual scenes and judge the relations between their constituent objects. AI benchmarks for visual reasoning have driven rapid progress in recent years with state-of-the-art systems now reaching human accuracy on some of these benchmarks. Yet, a major gap remains in terms of the sample efficiency with which humans and AI systems learn new visual reasoning tasks. Humans' remarkable efficiency at learning has been at least partially attributed to their ability to harness compositionality -- such that they can efficiently take advantage of previously gained knowledge when learning new tasks. Here, we introduce a novel visual reasoning benchmark, Compositional Visual Relations (CVR), to drive progress towards the development of more data-efficient learning algorithms. We take inspiration from fluidic intelligence and…
Taxonomy
Topics: Multimodal Machine Learning Applications · Visual Attention and Saliency Detection · Image Retrieval and Classification Techniques
