Same-different problems strain convolutional neural networks
Matthew Ricci, Junkyung Kim, Thomas Serre

TL;DR
This paper demonstrates that convolutional neural networks struggle with visual relation tasks, especially under high variability, and suggests feedback mechanisms like attention may be essential for abstract visual reasoning.
Contribution
The study reveals the limitations of CNNs in learning visual relations and highlights the potential role of feedback mechanisms inspired by biological vision.
Findings
CNNs fail on visual relation tasks with high intra-class variability
Networks break down when memorization is impossible
Feedback mechanisms may improve abstract visual reasoning
Abstract
The robust and efficient recognition of visual relations in images is a hallmark of biological vision. We argue that, despite recent progress in visual recognition, modern machine vision algorithms are severely limited in their ability to learn visual relations. Through controlled experiments, we demonstrate that visual-relation problems strain convolutional neural networks (CNNs). The networks eventually break altogether when rote memorization becomes impossible, as when intra-class variability exceeds network capacity. Motivated by the comparable success of biological vision, we argue that feedback mechanisms including attention and perceptual grouping may be the key computational components underlying abstract visual reasoning.
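To make the notion of a same-different problem concrete, the following is a minimal sketch of a toy stimulus generator. It is not the authors' stimulus set: the binary random patches, the function names (`make_patch_pair`, `render`, `count_distinct_patches`), and all sizes are assumptions chosen for illustration. It shows one way intra-class variability can grow quickly, since the number of possible patches doubles with every added pixel.

```python
import numpy as np


def make_patch_pair(rng, same, patch_size=4):
    """Return two binary patches that are identical (same) or differ (hypothetical stimulus)."""
    first = rng.integers(0, 2, size=(patch_size, patch_size))
    second = first.copy()
    if not same:
        # Flip one randomly chosen pixel so the patches are guaranteed to differ.
        r, c = rng.integers(0, patch_size, size=2)
        second[r, c] = 1 - second[r, c]
    return first, second


def render(rng, first, second, image_size=32):
    """Place one patch in the left half and one in the right half of a blank image."""
    patch_size = first.shape[0]
    half = image_size // 2
    image = np.zeros((image_size, image_size), dtype=np.uint8)
    for patch, x_offset in ((first, 0), (second, half)):
        # Random position inside the assigned half, so the patches never overlap.
        y = rng.integers(0, image_size - patch_size + 1)
        x = x_offset + rng.integers(0, half - patch_size + 1)
        image[y:y + patch_size, x:x + patch_size] = patch
    return image


def count_distinct_patches(patch_size):
    """Number of distinct binary patches of the given side length."""
    return 2 ** (patch_size * patch_size)


rng = np.random.default_rng(0)
first, second = make_patch_pair(rng, same=False)
image = render(rng, first, second)  # label for this image would be "different"
```

In this toy setting a 4x4 patch already admits 65,536 distinct shapes, so a classifier that relies on memorizing individual "same" pairs faces a rapidly growing set of examples as patch size increases.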
Taxonomy
Topics: Industrial Technology and Control Systems · Advanced Sensor and Control Systems
