Quantile QT-Opt for Risk-Aware Vision-Based Robotic Grasping

Cristian Bodnar; Adrian Li; Karol Hausman; Peter Pastor; Mrinal Kalakrishnan

arXiv:1910.02787 · cs.RO · July 3, 2020

TL;DR

This paper introduces Quantile QT-Opt (Q2-Opt), a distributional reinforcement learning algorithm for vision-based robotic grasping, and reports improved grasp success rates, better sample efficiency, and risk-aware control in simulated and real-world tasks.

Contribution

The paper proposes Quantile QT-Opt (Q2-Opt), a distributional RL method for continuous control, and evaluates it on robotic grasping, including risk-aware decision making and batch RL comparisons.

Findings

01. Q2-Opt outperforms existing methods in grasp success rate.

02. Q2-Opt is more sample efficient in real robotic tasks.

03. The distributional approach enables risk management in robotic control.
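
To illustrate the third finding, here is a minimal sketch of how a distribution over returns can support risk-aware action choice. It is not the paper's implementation: the function names, the candidate actions, and the choice of conditional value-at-risk (CVaR) as the risk measure are assumptions made for the example. It assumes each candidate action comes with a list of equally weighted quantile estimates of its return.

```python
from typing import Callable, Dict, Sequence


def mean_value(quantiles: Sequence[float]) -> float:
    """Risk-neutral score: average of equally weighted quantile estimates."""
    return sum(quantiles) / len(quantiles)


def cvar(quantiles: Sequence[float], alpha: float) -> float:
    """Risk-averse score: mean of the worst alpha-fraction of the quantiles."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(quantiles)
    # Keep at least one quantile so the score is always defined.
    k = max(1, int(round(alpha * len(ordered))))
    return sum(ordered[:k]) / k


def pick_action(
    candidates: Dict[str, Sequence[float]],
    score: Callable[[Sequence[float]], float],
) -> str:
    """Return the name of the candidate whose quantiles score highest."""
    return max(candidates, key=lambda name: score(candidates[name]))


# Hypothetical quantile estimates for two candidate grasps.
candidates = {
    "safe": [0.5, 0.5, 0.5, 0.5],
    "risky": [0.0, 0.0, 1.0, 1.2],
}

print(pick_action(candidates, mean_value))               # risk-neutral choice
print(pick_action(candidates, lambda q: cvar(q, 0.5)))   # risk-averse choice
```

With these made-up numbers the risk-neutral score prefers the candidate with the higher mean, while the CVaR score prefers the candidate whose worst outcomes are better. A scalar Q-value cannot make that distinction, which is the sense in which a distributional critic enables risk management.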

Abstract

The distributional perspective on reinforcement learning (RL) has given rise to a series of successful Q-learning algorithms, resulting in state-of-the-art performance in arcade game environments. However, it has not yet been analyzed how these findings from a discrete setting translate to complex practical applications characterized by noisy, high dimensional and continuous state-action spaces. In this work, we propose Quantile QT-Opt (Q2-Opt), a distributional variant of the recently introduced distributed Q-learning algorithm for continuous domains, and examine its behaviour in a series of simulated and real vision-based robotic grasping tasks. The absence of an actor in Q2-Opt allows us to directly draw a parallel to the previous discrete experiments in the literature without the additional complexities induced by an actor-critic architecture. We demonstrate that Q2-Opt achieves a…
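
The truncated abstract does not say how the return distribution is trained. As a hedged illustration of the general idea behind quantile-based distributional Q-learning, the sketch below computes the quantile Huber loss used in quantile-regression variants of Q-learning. The function names, the midpoint quantile fractions, and the threshold `kappa` are assumptions for this example, not details taken from the paper.

```python
from typing import Sequence


def huber(u: float, kappa: float = 1.0) -> float:
    """Huber function: quadratic near zero, linear beyond kappa."""
    a = abs(u)
    return 0.5 * u * u if a <= kappa else kappa * (a - 0.5 * kappa)


def quantile_huber_loss(
    predicted: Sequence[float],
    target: Sequence[float],
    kappa: float = 1.0,
) -> float:
    """Quantile Huber loss between predicted quantiles and target samples.

    predicted[i] is the estimate for quantile fraction tau_i = (2i + 1) / (2N).
    The loss is summed over predicted quantiles and averaged over targets.
    """
    n = len(predicted)
    taus = [(2 * i + 1) / (2 * n) for i in range(n)]
    total = 0.0
    for tau, theta in zip(taus, predicted):
        for z in target:
            u = z - theta  # error of this quantile against this target sample
            # Asymmetric weight: under- and over-estimates are penalised
            # differently depending on the quantile fraction tau.
            weight = abs(tau - (1.0 if u < 0 else 0.0))
            total += weight * huber(u, kappa) / kappa
    return total / len(target)


# Example: two predicted quantiles compared with a single target sample.
print(quantile_huber_loss([0.0, 0.0], [1.0]))
```

The asymmetric weight is what pushes each output toward a different quantile of the target distribution instead of toward its mean.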

Taxonomy

Methods: Q-Learning