RAVEN: A Dataset for Relational and Analogical Visual rEasoNing
Chi Zhang, Feng Gao, Baoxiong Jia, Yixin Zhu, Song-Chun Zhu

TL;DR
This paper introduces RAVEN, a new dataset based on Raven's Progressive Matrices that links visual perception with structural and relational reasoning, aiming to advance machine intelligence in high-level visual tasks.
Contribution
The paper presents a new dataset that connects vision and reasoning through structured representations, with the aim of advancing abstract reasoning in machines.
Findings
Models augmented with the paper's proposed neural module show improved reasoning performance.
Human performance is provided as a benchmark.
The dataset facilitates evaluation of high-level visual reasoning.
Abstract
Dramatic progress has been witnessed in basic vision tasks involving low-level perception, such as object recognition, detection, and tracking. Unfortunately, there is still an enormous performance gap between artificial vision systems and human intelligence in terms of higher-level vision problems, especially ones involving reasoning. Earlier attempts in equipping machines with high-level reasoning have hovered around Visual Question Answering (VQA), one typical task associating vision and language understanding. In this work, we propose a new dataset, built in the context of Raven's Progressive Matrices (RPM) and aimed at lifting machine intelligence by associating vision with structural, relational, and analogical reasoning in a hierarchical representation. Unlike previous works in measuring abstract reasoning using RPM, we establish a semantic link between vision and reasoning by…
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling
