Two-stage Rule-induction Visual Reasoning on RPMs with an Application to Video Prediction
Wentao He, Jianfeng Ren, Ruibin Bai, Xudong Jiang

TL;DR
This paper introduces TRIVR, a two-stage visual reasoner that models human-like reasoning for RPMs and applies it to real-world video prediction, achieving superior performance over existing methods.
Contribution
The paper proposes a novel two-stage rule-induction framework with a '2+1' reasoning formulation that derives explicit reasoning rules from RPM samples, enhancing interpretability and performance.
Findings
TRIVR outperforms state-of-the-art models on RPM-like datasets.
The '2+1' formulation reduces model complexity and improves reasoning accuracy.
A newly constructed RPM-like Video Prediction dataset validates real-world applicability.
Abstract
Raven's Progressive Matrices (RPMs) are frequently used to evaluate human visual reasoning ability. Researchers have made considerable efforts to develop systems that automatically solve the RPM problem, often through a black-box end-to-end convolutional neural network that handles both visual recognition and logical reasoning. Based on the two intrinsic natures of the RPM problem, visual recognition and logical reasoning, we propose a Two-stage Rule-Induction Visual Reasoner (TRIVR), which consists of a perception module and a reasoning module, to tackle the challenges of real-world visual recognition and subsequent logical reasoning, respectively. For the reasoning module, we further propose a "2+1" formulation that models human thinking in solving RPMs and significantly reduces the model complexity. It derives a reasoning rule from each RPM sample, which is not feasible for…
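The split between a perception stage and a rule-inducing reasoning stage can be illustrated with a toy sketch. This is not the authors' TRIVR model and it does not reproduce the "2+1" formulation, whose details are not given in the text above. Everything here is an assumption made for illustration: the function names (`perceive`, `induce_rules`, `solve`, `dots`), the three-rule vocabulary, and a perception stage that simply counts filled cells in place of a trained network.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Tuple[int, ...]
Panel = Sequence[Sequence[int]]

# Hypothetical rule vocabulary: each rule tests one row of three attribute values.
RULES: Dict[str, Callable[[Row], bool]] = {
    "constant": lambda r: r[0] == r[1] == r[2],
    "progression": lambda r: (r[1] - r[0] == r[2] - r[1]) and (r[1] != r[0]),
    "sum": lambda r: r[0] + r[1] == r[2],
}


def perceive(panel: Panel) -> int:
    """Stage 1 stand-in: map a panel to one attribute (number of filled cells).

    A real system would use a trained perception network here.
    """
    return sum(sum(row) for row in panel)


def induce_rules(rows: List[Row]) -> List[str]:
    """Stage 2a: keep only the rules that hold on every complete row."""
    return [name for name, holds in RULES.items() if all(holds(r) for r in rows)]


def solve(context: Sequence[Panel], candidates: Sequence[Panel]) -> int:
    """Stage 2b: return the index of the first candidate that satisfies an induced rule.

    `context` holds the 8 known panels in row-major order; returns -1 if none fits.
    """
    values = [perceive(p) for p in context]
    rules = induce_rules([tuple(values[0:3]), tuple(values[3:6])])
    for idx, cand in enumerate(candidates):
        row = (values[6], values[7], perceive(cand))
        if any(RULES[name](row) for name in rules):
            return idx
    return -1


def dots(n: int) -> List[List[int]]:
    """Build a 3x3 toy panel with n filled cells (0 <= n <= 9)."""
    flat = [1] * n + [0] * (9 - n)
    return [flat[i:i + 3] for i in range(0, 9, 3)]
```

Because the induced rule is an explicit name from the vocabulary rather than a hidden activation, the answer can be explained after the fact, which is the kind of interpretability the two-stage split is meant to provide. For a matrix whose rows show 1, 2, 3 and 2, 3, 4 filled cells, `induce_rules` keeps only `"progression"`, and `solve` then picks the candidate that continues 3, 4 with 5.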
Taxonomy
Topics: Neural Networks and Applications · Machine Learning in Bioinformatics
