Emergent Active Perception and Dexterity of Simulated Humanoids from Visual Reinforcement Learning
Zhengyi Luo, Chen Tessler, Toru Lin, Ye Yuan, Tairan He, Wenli Xiao, Yunrong Guo, Gal Chechik, Kris Kitani, Linxi Fan, Yuke Zhu

TL;DR
This paper presents Perceptive Dexterous Control (PDC), a vision-based framework enabling simulated humanoids to perform complex household tasks through active perception and reinforcement learning, without relying on privileged state information.
Contribution
The paper introduces PDC, a vision-driven control framework in which simulated humanoids perform multiple tasks from egocentric vision using reinforcement learning, with active search behaviors emerging during training.
Findings
PDC enables simulated humanoids to perform object search, grasping, and manipulation tasks from egocentric vision.
Reinforcement learning from scratch produces human-like active search behaviors.
The approach demonstrates the importance of perception-action loops in embodied AI.
Abstract
Human behavior is fundamentally shaped by visual perception -- our ability to interact with the world depends on actively gathering relevant information and adapting our movements accordingly. Behaviors like searching for objects, reaching, and hand-eye coordination naturally emerge from the structure of our sensory system. Inspired by these principles, we introduce Perceptive Dexterous Control (PDC), a framework for vision-driven dexterous whole-body control with simulated humanoids. PDC operates solely on egocentric vision for task specification, enabling object search, target placement, and skill selection through visual cues, without relying on privileged state information (e.g., 3D object positions and geometries). This perception-as-interface paradigm enables learning a single policy to perform multiple household tasks, including reaching, grasping, placing, and articulated object…
Taxonomy
Topics: Robot Manipulation and Learning · Motor Control and Adaptation · Social Robot Interaction and HRI
Methods: Perceptive Dexterous Control (PDC) · Reinforcement Learning
