Improving Robotic Grasping on Monocular Images Via Multi-Task Learning and Positional Loss

William Prew; Toby Breckon; Magnus Bordewich; Ulrik Beierholm

arXiv:2011.02888·cs.RO·November 6, 2020

TL;DR

This paper presents two methods for improving real-time robotic grasping from monocular colour images: multi-task learning with an auxiliary depth reconstruction task, and a positional loss function. Used together, they raise grasping performance on the Jacquard dataset from a 72.04% baseline to 79.12%.

Contribution

It shows that multi-task learning and a positional loss, applied separately or together, improve grasping performance in an end-to-end CNN, and that the positional loss also reduces the number of training epochs required.

Findings

01

Multi-task learning with an auxiliary depth reconstruction task raised grasping performance on the Jacquard dataset from a 72.04% baseline to 78.14%.

02

The positional loss raised grasping performance from a 72.04% baseline to 78.92% and reduced the number of training epochs required.

03

Using both methods in tandem raised grasping performance further, to 79.12%.

Abstract

In this paper, we introduce two methods of improving real-time object grasping performance from monocular colour images in an end-to-end CNN architecture. The first is the addition of an auxiliary task during model training (multi-task learning). Our multi-task CNN model improves grasping performance from a baseline average of 72.04% to 78.14% on the large Jacquard grasping dataset when performing a supplementary depth reconstruction task. The second is introducing a positional loss function that emphasises loss per pixel for secondary parameters (gripper angle and width) only on points of an object where a successful grasp can take place. This increases performance from a baseline average of 72.04% to 78.92% as well as reducing the number of training epochs required. These methods can also be performed in tandem, resulting in a further performance increase to 79.12% while maintaining…
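The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact loss formulation, so the mean-squared-error terms, the choice of "quality > 0" as the set of graspable pixels, the hard mask (the paper may instead re-weight pixels), the dictionary keys, and the `depth_weight` parameter are all assumptions made for illustration.

```python
import numpy as np


def positional_loss(pred, target, grasp_mask):
    """Mean squared error averaged only over pixels where grasp_mask is nonzero.

    Assumption: "points where a successful grasp can take place" are given
    as a per-pixel mask; pixels outside it contribute nothing to the loss.
    """
    mask = np.asarray(grasp_mask).astype(np.float64)
    sq_err = (np.asarray(pred, dtype=np.float64) - np.asarray(target, dtype=np.float64)) ** 2
    n = mask.sum()
    if n == 0:
        # No graspable pixels in this sample: nothing to penalise.
        return 0.0
    return float((sq_err * mask).sum() / n)


def total_loss(pred, target, depth_weight=1.0):
    """Combine grasp-map losses with an auxiliary depth reconstruction loss.

    pred and target are dicts of HxW arrays with the hypothetical keys
    "quality", "angle", "width" and "depth".
    """
    # Assumption: a pixel is graspable when the ground-truth quality map is positive.
    mask = target["quality"] > 0

    # Grasp quality is supervised at every pixel.
    quality = float(np.mean((pred["quality"] - target["quality"]) ** 2))

    # Secondary parameters (gripper angle and width) are supervised
    # only on graspable pixels.
    angle = positional_loss(pred["angle"], target["angle"], mask)
    width = positional_loss(pred["width"], target["width"], mask)

    # Auxiliary task: reconstruct depth from the monocular colour input.
    depth = float(np.mean((pred["depth"] - target["depth"]) ** 2))

    return quality + angle + width + depth_weight * depth
```

In this sketch the depth term only shapes training; at inference time the depth output could simply be ignored, so the grasp prediction still needs nothing beyond the colour image.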
