Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop

Justin Kerr; Kush Hari; Ethan Weber; Chung Min Kim; Brent Yi; Tyler Bonnen; Ken Goldberg; Angjoo Kanazawa

arXiv:2506.10968 · cs.RO · September 16, 2025

Open Access

TL;DR

EyeRobot is a robotic system that learns to coordinate gaze and manipulation actions through a combined imitation and reinforcement learning approach, enabling effective hand-eye coordination for complex tasks.

Contribution

This work introduces a BC-RL loop that jointly trains the gaze policy (RL) and the hand policy (BC), along with a foveal-inspired architecture for efficient and stable fixation behavior.

Findings

01. Emergence of hand-eye coordination behaviors.

02. Improved object tracking and ability to ignore distractors.

03. Effective manipulation over large workspaces.

Abstract

Humans do not passively observe the visual world -- we actively look in order to act. Motivated by this principle, we introduce EyeRobot, a robotic system with gaze behavior that emerges from the need to complete real-world tasks. We develop a mechanical eyeball that can freely rotate to observe its surroundings and train a gaze policy to control it using reinforcement learning. We accomplish this by first collecting teleoperated demonstrations paired with a 360 camera. This data is imported into a simulation environment that supports rendering arbitrary eyeball viewpoints, allowing episode rollouts of eye gaze on top of robot demonstrations. We then introduce a BC-RL loop to train the hand and eye jointly: the hand (BC) agent is trained from rendered eye observations, and the eye (RL) agent is rewarded when the hand produces correct action predictions. In this way, hand-eye…
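The reward coupling described in the abstract (the eye is rewarded when the hand predicts the demonstrated action correctly) can be illustrated with a toy sketch. This is not the authors' implementation: `render_view`, `hand_policy`, `eye_reward`, and `rollout_reward` are hypothetical names, the "hand" is a fixed hand-written function rather than a trained BC network, the world is reduced to a single angle, and the negative squared error is an assumed reward form.

```python
def wrap_degrees(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def render_view(target_angle, gaze_angle, fov=60.0):
    """Hypothetical stand-in for rendering an eyeball viewpoint.

    Returns the target's angular offset from the gaze direction if it lies
    inside the field of view, otherwise None (target not visible).
    """
    offset = wrap_degrees(target_angle - gaze_angle)
    return offset if abs(offset) <= fov / 2.0 else None


def hand_policy(observation, gaze_angle):
    """Toy hand agent: predicts the target direction from what the eye sees.

    Assumption: with no target in view, it falls back to a fixed guess of 0.
    """
    if observation is None:
        return 0.0
    return gaze_angle + observation


def eye_reward(predicted_action, demo_action):
    """Assumed reward: negative squared angular error of the hand prediction."""
    error = wrap_degrees(predicted_action - demo_action)
    return -(error ** 2)


def rollout_reward(demo_action, gaze_angle):
    """One step of the loop: render from the gaze, let the hand predict, score the eye."""
    observation = render_view(demo_action, gaze_angle)
    predicted = hand_policy(observation, gaze_angle)
    return eye_reward(predicted, demo_action)


if __name__ == "__main__":
    # Looking near the target lets the hand predict well, so the eye scores higher.
    print(rollout_reward(demo_action=90.0, gaze_angle=80.0))
    print(rollout_reward(demo_action=90.0, gaze_angle=-90.0))
```

In this toy setup, a gaze that keeps the target in view yields a better hand prediction and therefore a higher eye reward, which is the incentive the abstract describes. In the actual system both agents are learned and the observations are rendered images.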

Taxonomy

Topics: Gaze Tracking and Assistive Technology · Social Robot Interaction and HRI · Robot Manipulation and Learning