Contrastively Learning Visual Attention as Affordance Cues from Demonstrations for Robotic Grasping
Yantian Zha, Siddhant Bhambri, Lin Guan

TL;DR
This paper introduces a contrastive learning approach that uses visual attention as affordance cues in an end-to-end imitation learning framework for robotic grasping, eliminating the need for explicit grasp configuration prediction.
Contribution
The paper proposes a contrastive learning framework with a coupled triplet loss that discovers affordance cues as visual attention, bridging affordance discovery and affordance-based policy learning.
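For readers unfamiliar with triplet losses, the following is a minimal sketch of the generic triplet margin loss, which pulls an anchor embedding toward a positive example and pushes it away from a negative one. It is not the paper's coupled triplet loss, whose exact form is not given in this summary. The function names, the margin value, and the toy embeddings are all assumptions made for illustration.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    Illustrative only; the paper's coupled variant is not reproduced here.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical 2-D embeddings, chosen so the distances are easy to check.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # distance 1 from the anchor
negative = [3.0, 4.0]   # distance 5 from the anchor

# The negative is already farther than the positive by more than the margin,
# so the loss is zero.
loss_satisfied = triplet_loss(anchor, positive, negative)

# Swapping the roles violates the margin: 5 - 1 + 1 = 5.
loss_violated = triplet_loss(anchor, negative, positive)
```

The loss is zero once the negative sits at least `margin` farther from the anchor than the positive does, so only triplets that violate the margin contribute a training signal.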
Findings
Achieves the highest grasping success rate in simulation among the compared methods
Effectively learns affordance cues as visual attention
Outperforms baseline methods in grasping tasks
Abstract
Conventional works that learn grasping affordance from demonstrations need to explicitly predict grasping configurations, such as gripper approaching angles or grasping preshapes. Classic motion planners could then sample trajectories by using such predicted configurations. In this work, our goal is instead to fill the gap between affordance discovery and affordance-based policy learning by integrating the two objectives in an end-to-end imitation learning framework based on deep neural networks. From a psychological perspective, there is a close association between attention and affordance. Therefore, with an end-to-end neural network, we propose to learn affordance cues as visual attention that serves as a useful indicating signal of how a demonstrator accomplishes tasks, instead of explicitly modeling affordances. To achieve this, we propose a contrastive learning framework that…
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Motor Control and Adaptation
Methods: Contrastive Learning · Triplet Loss
