Unraveling the geometry of visual relational reasoning

Jiaqi Shang; Gabriel Kreiman; Haim Sompolinsky

arXiv:2502.17382·q-bio.NC·July 28, 2025

TL;DR

This paper introduces SimplifiedRPM, a benchmark for evaluating abstract relational reasoning in neural networks, and uses geometric and layer-wise analyses to examine how well different models, SCL in particular, generalize and align with human behavior.

Contribution

It presents SimplifiedRPM as a novel benchmark, compares four models (ResNet-50, Vision Transformer, Wild Relation Network, and SCL) against human data, and develops a geometric framework to understand and improve relational reasoning in AI.

Findings

01. SCL generalizes best and aligns most closely with human behavior

02. A geometric trade-off between signal and dimensionality affects generalization

03. Layer-wise analysis reveals where relational structure emerges

Abstract

Humans readily generalize abstract relations, such as recognizing "constant" in shape or color, whereas neural networks struggle, limiting their flexible reasoning. To investigate mechanisms underlying such generalization, we introduce SimplifiedRPM, a novel benchmark for systematically evaluating abstract relational reasoning, addressing limitations in prior datasets. In parallel, we conduct human experiments to quantify relational difficulty, enabling direct model-human comparisons. Testing four models, ResNet-50, Vision Transformer, Wild Relation Network, and Scattering Compositional Learner (SCL), we find that SCL generalizes best and most closely aligns with human behavior. Using a geometric approach, we identify key representation properties that accurately predict generalization and uncover a fundamental trade-off between signal and dimensionality: novel relations compress into…
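The signal and dimensionality properties named in the abstract can be made concrete with a small sketch. The definitions below are assumptions chosen for illustration, not the paper's own: dimensionality is taken as the participation ratio of the representation covariance, (tr C)^2 / tr(C^2), and signal as the squared distance between two class centroids divided by their mean total variance. All function names are hypothetical, and the code uses only the Python standard library.

```python
def mean_vector(X):
    """Centroid of a list of equal-length feature vectors."""
    n = len(X)
    return [sum(row[j] for row in X) / n for j in range(len(X[0]))]


def covariance(X):
    """Unbiased sample covariance matrix of the rows of X."""
    n, d = len(X), len(X[0])
    mu = mean_vector(X)
    C = [[0.0] * d for _ in range(d)]
    for row in X:
        dev = [row[j] - mu[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                C[a][b] += dev[a] * dev[b] / (n - 1)
    return C


def participation_ratio(X):
    """Assumed dimensionality measure: (tr C)^2 / tr(C^2).

    Because C is symmetric, tr(C^2) equals the sum of its squared entries,
    so no eigendecomposition is needed.
    """
    C = covariance(X)
    trace = sum(C[i][i] for i in range(len(C)))
    trace_of_square = sum(v * v for row in C for v in row)
    return trace ** 2 / trace_of_square


def signal(A, B):
    """Assumed signal measure: squared centroid distance / mean total variance."""
    mu_a, mu_b = mean_vector(A), mean_vector(B)
    dist_sq = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    cov_a, cov_b = covariance(A), covariance(B)
    var_a = sum(cov_a[i][i] for i in range(len(cov_a)))
    var_b = sum(cov_b[i][i] for i in range(len(cov_b)))
    return dist_sq / (0.5 * (var_a + var_b))


if __name__ == "__main__":
    # Toy representations: points on a line vs. points spread over two axes.
    line = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
    cross = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
    print(participation_ratio(line), participation_ratio(cross))
```

Under these assumed definitions, points confined to a line have a lower participation ratio than points spread evenly over two axes, which is the sense in which a representation can be called compressed. The toy data says nothing about how signal behaves in the paper's models.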

Code & Models

Repositories

shaneshang/simplifiedrpm-visual-reasoning
PyTorch · Official

Taxonomy

Methods: Attention Is All You Need · Absolute Position Encodings · Linear Layer · Layer Normalization · Byte Pair Encoding · Dense Connections · Residual Connection · Label Smoothing · Multi-Head Attention · Position-Wise Feed-Forward Layer