Toward Sim-to-Real Directional Semantic Grasping
Shariq Iqbal, Jonathan Tremblay, Thang To, Jia Cheng, Erik Leitch, Andy Campbell, Kirby Leung, Duncan McKay, Stan Birchfield

TL;DR
This paper presents a deep reinforcement learning approach for directional semantic grasping, using simulated data and domain randomization to enable real-world application of robot grasping from specific directions.
Contribution
It introduces an end-to-end system that maps RGB images to robot commands for directional grasping, bridging the sim-to-real gap with domain randomization.
Findings
Successful simulation and real-world grasping demonstrations
Effective use of DDQN and CEM for control
Identification of challenges for future research
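As a rough illustration of the DDQN idea named above, the sketch below computes double-DQN regression targets for a batch of transitions. It is not taken from the paper: the function name, the discount factor, and the use of a small discrete set of candidate actions are all assumptions made for brevity. With continuous Cartesian actions, the argmax step would have to be approximated, for example by CEM.

```python
import numpy as np

def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.9):
    """Double-DQN targets for a batch (hypothetical helper, discrete actions assumed).

    rewards, dones:  arrays of shape (batch,)
    q_online_next:   online-network Q-values at the next state, shape (batch, n_actions)
    q_target_next:   target-network Q-values at the next state, shape (batch, n_actions)
    """
    # The online network selects the next action...
    best = np.argmax(q_online_next, axis=1)
    # ...and the target network evaluates that action.
    next_values = q_target_next[np.arange(len(best)), best]
    # Terminal transitions (dones == 1) do not bootstrap.
    return rewards + gamma * (1.0 - dones) * next_values
```

Decoupling action selection from action evaluation is what distinguishes double DQN from plain DQN, where a single network does both and tends to overestimate values.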
Abstract
We address the problem of directional semantic grasping, that is, grasping a specific object from a specific direction. We approach the problem using deep reinforcement learning via a double deep Q-network (DDQN) that learns to map downsampled RGB input images from a wrist-mounted camera to Q-values, which are then translated into Cartesian robot control commands via the cross-entropy method (CEM). The network is learned entirely on simulated data generated by a custom robot simulator that models both physical reality (contacts) and perceptual quality (high-quality rendering). The reality gap is bridged using domain randomization. The system is an example of end-to-end (mapping input monocular RGB images to output Cartesian motor commands) grasping of objects from multiple pre-defined object-centric orientations, such as from the side or top. We show promising results in both simulation…
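The step from Q-values to a control command can be sketched as follows. This is a minimal illustration of the cross-entropy method used as an approximate argmax over actions, not the authors' implementation: the sample counts, the Gaussian initialization, and the toy Q-function standing in for the trained network are all assumptions.

```python
import numpy as np

def cem_select_action(q_fn, action_dim, n_iters=5, n_samples=128, n_elite=12, seed=0):
    """Approximate argmax_a Q(s, a) with the cross-entropy method.

    q_fn maps an array of candidate actions, shape (n_samples, action_dim),
    to an array of scores, shape (n_samples,). All hyperparameters here are
    illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(action_dim)
    std = np.ones(action_dim)
    for _ in range(n_iters):
        # Sample candidate actions from the current Gaussian.
        actions = rng.normal(mean, std, size=(n_samples, action_dim))
        scores = q_fn(actions)
        # Keep the highest-scoring candidates and refit the Gaussian to them.
        elite = actions[np.argsort(scores)[-n_elite:]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-6
    return mean

# Hypothetical stand-in for the learned Q-network: scores peak at `target`.
target = np.array([0.3, -0.2, 0.5])

def toy_q(actions):
    return -np.sum((actions - target) ** 2, axis=1)

best_action = cem_select_action(toy_q, action_dim=3)
```

In a real system, `q_fn` would evaluate the network on the current camera image paired with each candidate action, and the returned mean would be sent to the robot as the Cartesian command.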
