Human Visual Understanding for Cognition and Manipulation -- A primer for the roboticist
Martin Hjelm

TL;DR
This paper reviews human visual understanding as it relates to cognition and manipulation, highlighting principles such as hierarchical processing and the separation of visual processing for action from visual processing for cognition, to inform robotic research and correct misconceptions that arise from self-examination.
Contribution
It provides a comprehensive overview of theories of human visual processing, emphasizing principles relevant to robotics and correcting misconceptions that arise from self-examination.
Findings
Identification of visual processing for action and cognition as separate pathways
Hierarchical and contextual processing principles in visual understanding
Critique of self-examination approaches in robotic research
Abstract
Robotic research is often built on approaches that are motivated by insights from self-examination of how we interface with the world. However, given current theories about human cognition and sensory processing, it is reasonable to assume that the internal workings of the brain are separate from how we interface with the world and ourselves. To amend some of the misconceptions arising from self-examination, this article reviews human visual understanding for cognition and action, specifically manipulation. Our focus is on identifying overarching principles such as the separation into visual processing for action and cognition, hierarchical processing of visual input, and the contextual and anticipatory nature of visual processing for action. We also provide a rudimentary exposition of previous theories about visual understanding that shows how self-examination can lead down the wrong…
Taxonomy
Topics: Robot Manipulation and Learning · Action Observation and Synchronization · Face Recognition and Perception
