What You See is What You Grasp: User-Friendly Grasping Guided by Near-eye-tracking
Shaochen Wang, Wei Zhang, Zhangli Zhou, Jiaxi Cao, Ziyang Chen, Kang Chen, Bin Li, and Zhen Kan

TL;DR
This paper introduces a human-robot interface that uses near-eye-tracking to infer user intentions and guide robotic grasping, enabling intuitive sight-based manipulation for assistive applications.
Contribution
It presents a novel system combining near-eye-tracking with a transformer-based grasp model for sight-guided robotic manipulation.
Findings
Low gaze estimation error achieved
Promising grasping results on multiple datasets
Effective integration of eye-tracking with robotic control
Abstract
This work presents a next-generation human-robot interface that can infer and realize the user's manipulation intention via sight only. Specifically, we develop a system that integrates near-eye-tracking and robotic manipulation to enable user-specified actions (e.g., grasp, pick-and-place, etc.), where visual information is merged with human attention to create a mapping for desired robot actions. To enable sight-guided manipulation, a head-mounted near-eye-tracking device is developed to track eyeball movements in real time, so that the user's visual attention can be identified. To improve the grasping performance, a transformer-based grasp model is then developed. Stacked transformer blocks are used to extract hierarchical features, where the number of channels is expanded at each stage while the resolution of the feature maps is reduced. Experimental validation demonstrates that the…
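The hierarchical feature extraction described in the abstract (fewer spatial positions, more channels at each stage) can be illustrated with a minimal sketch. It is not the authors' model: the function names `space_to_depth`, `shape_of`, and `stage_shapes`, the 2x2 merge factor, and the toy 8x8 input are all assumptions made for illustration. The attention layers and any learned projection are omitted, so only the change in shape is shown.

```python
def space_to_depth(feature_map, block=2):
    """Merge each block x block neighbourhood into a single position.

    feature_map: nested list of shape [H][W][C].
    Returns a nested list of shape [H // block][W // block][C * block * block].
    The 2x2 default is an assumption; the abstract does not give the factor.
    """
    height, width = len(feature_map), len(feature_map[0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")
    merged = []
    for row in range(0, height, block):
        merged_row = []
        for col in range(0, width, block):
            channels = []
            # Concatenate the channels of all positions in the neighbourhood.
            for dr in range(block):
                for dc in range(block):
                    channels.extend(feature_map[row + dr][col + dc])
            merged_row.append(channels)
        merged.append(merged_row)
    return merged


def shape_of(feature_map):
    """Return (H, W, C) for a nested-list feature map."""
    return (len(feature_map), len(feature_map[0]), len(feature_map[0][0]))


def stage_shapes(feature_map, num_stages):
    """Record the feature-map shape before and after each merge stage."""
    shapes = [shape_of(feature_map)]
    for _ in range(num_stages):
        feature_map = space_to_depth(feature_map)
        shapes.append(shape_of(feature_map))
    return shapes


# Toy 8x8 single-channel input (hypothetical data).
fmap = [[[float(r * 8 + c)] for c in range(8)] for r in range(8)]
print(stage_shapes(fmap, 3))
# → [(8, 8, 1), (4, 4, 4), (2, 2, 16), (1, 1, 64)]
```

Each stage halves the height and width and multiplies the channel count by four. A real model would typically follow the merge with a learned linear projection to control how fast the channels grow.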
Taxonomy
Topics: Gaze Tracking and Assistive Technology · Stroke Rehabilitation and Recovery · Soft Robotics and Applications
