A Deep Learning Model of Mental Rotation Informed by Interactive VR Experiments
Raymond Khazoum, Daniela Fernandes, Aleksandr Krylov, Qin Li, Stephane Deny

TL;DR
This paper presents a novel deep learning model of human mental rotation that integrates equivariant neural encoding, neuro-symbolic object descriptions, and a decision-making process, validated through VR experiments and behavioral data.
Contribution
The paper introduces a mechanistic, multi-component deep learning model of mental rotation informed by VR experiments, combining an equivariant neural encoder, a neuro-symbolic object encoder, and a neural decision agent.
Findings
The model accurately predicts human performance and response times.
Systematic ablations confirm that each component is necessary.
The model's behavior aligns with data from the VR experiments and from prior studies.
Abstract
Mental rotation -- the ability to compare objects seen from different viewpoints -- is a fundamental example of mental simulation and spatial world modelling in humans. Here we propose a mechanistic model of human mental rotation, leveraging advances in deep, equivariant, and neuro-symbolic learning. Our model consists of three stacked components: (1) an equivariant neural encoder, taking images as input and producing 3D spatial representations of objects, (2) a neuro-symbolic object encoder, deriving symbolic descriptions of objects from these spatial representations, and (3) a neural decision agent, comparing these symbolic descriptions to prescribe rotation simulations in 3D latent space via a recurrent pathway. Our model design is guided by the abundant experimental literature on mental rotation, which we complemented with experiments in VR where participants could at times…
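To make the three-stage pipeline described in the abstract more concrete, here is a minimal toy sketch in Python. It is not the authors' implementation. Every function name (encode, describe, compare), the z-axis-only rotation, the fixed 10-degree step, and the assumption that the two point sets list corresponding points in the same order are hypothetical choices made for illustration. The encoder is replaced by the identity map on 3D points (trivially rotation-equivariant), the symbolic description by a rotation-invariant list of pairwise distances, and the decision agent by a loop that rotates one latent step by step until it matches the other or a full turn has been tried.

```python
import numpy as np

def rotation_z(angle_deg):
    """3x3 rotation matrix about the z axis."""
    t = np.deg2rad(angle_deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def encode(points):
    """Hypothetical stand-in for the encoder: identity on 3D points,
    so rotating the input rotates the latent in the same way."""
    return np.asarray(points, dtype=float)

def describe(latent):
    """Hypothetical stand-in for the symbolic description:
    sorted pairwise distances, which do not change under rotation."""
    diffs = latent[:, None, :] - latent[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    upper = np.triu_indices(len(latent), k=1)
    return np.sort(dists[upper])

def compare(points_a, points_b, step_deg=10.0, tol=1e-6):
    """Toy decision loop: returns (verdict, number_of_rotation_steps)."""
    za, zb = encode(points_a), encode(points_b)
    # Descriptions that disagree rule out a match without any simulation.
    if not np.allclose(describe(za), describe(zb), atol=tol):
        return "different", 0
    n_steps = int(round(360.0 / step_deg))
    for k in range(n_steps):
        # Simulate the rotation directly in latent space.
        rotated = za @ rotation_z(k * step_deg).T
        if np.max(np.abs(rotated - zb)) < tol:
            return "same", k
    return "different", n_steps

# An asymmetric toy object, a rotated copy, and a rotated mirror image.
obj = np.array([[1, 0, 0], [0, 2, 0], [0, 0, 3], [1, 1, 0]], dtype=float)
rotated = obj @ rotation_z(60).T
mirrored = (obj * np.array([-1.0, 1.0, 1.0])) @ rotation_z(60).T

print(compare(obj, rotated))   # ('same', 6)
print(compare(obj, mirrored))  # ('different', 36)
```

In this toy version the step count grows with the angular disparity between the two views, and a mirror image passes the description check but fails every simulated rotation. Whether the paper's model behaves this way in detail is not stated in the text above.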
Taxonomy
Topics: Spatial Cognition and Navigation · Constraint Satisfaction and Optimization · Human Motion and Animation
