Bingham Policy Parameterization for 3D Rotations in Reinforcement Learning

Stephen James; Pieter Abbeel

arXiv:2202.03957 · cs.RO · February 9, 2022 · 6 cites

TL;DR

This paper introduces Bingham Policy Parameterization (BPP), a policy parameterization for representing 3D rotations in reinforcement learning that outperforms Gaussian policies on tasks requiring rotation prediction.

Contribution

The paper proposes BPP, a new policy parameterization based on the Bingham distribution, tailored for better 3D rotation prediction in reinforcement learning environments.

Findings

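1. BPP outperforms Gaussian policies on tasks that require predicting a 3D rotation.

2. BPP improves on a Gaussian parameterization on the Wahba problem and on vision-based next-best pose robot manipulation tasks from RLBench.

3. The work encourages the development of environment-specific policy parameterizations.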
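For readers unfamiliar with the Wahba problem mentioned above: it asks for the rotation R that best aligns two sets of corresponding vectors, minimizing the sum of ||w_i - R v_i||^2. The sketch below uses the classical closed-form SVD solution to illustrate the problem itself. It is not the paper's method (the paper uses the problem as a reinforcement learning task), and the function name `solve_wahba_svd` and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def solve_wahba_svd(v, w):
    """Return the rotation R minimizing sum_i ||w_i - R v_i||^2.

    v, w: arrays of shape (N, 3) holding corresponding vectors.
    (Hypothetical helper; classical SVD solution, not the paper's approach.)
    """
    B = w.T @ v                          # 3x3 matrix: sum_i w_i v_i^T
    U, _, Vt = np.linalg.svd(B)
    # Force det(R) = +1 so the result is a proper rotation, not a reflection.
    d = np.linalg.det(U) * np.linalg.det(Vt)
    return U @ np.diag([1.0, 1.0, d]) @ Vt


# Synthetic, noise-free demo: build a random rotation and try to recover it.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] = -Q[:, 0]                   # flip one axis to get det = +1
R_true = Q

v = rng.standard_normal((10, 3))         # reference vectors
w = v @ R_true.T                         # observed vectors: w_i = R_true v_i

R_est = solve_wahba_svd(v, w)
max_err = float(np.abs(R_est - R_true).max())
```

With noise-free correspondences, `R_est` should match `R_true` up to floating-point error. A learned policy has no such closed form to rely on, which is why the choice of output distribution over rotations matters.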

Abstract

We propose a new policy parameterization for representing 3D rotations during reinforcement learning. Today in the continuous control reinforcement learning literature, many stochastic policy parameterizations are Gaussian. We argue that universally applying a Gaussian policy parameterization is not desirable for all environments. One such case is tasks that involve predicting a 3D rotation output, either in isolation or coupled with translation as part of a full 6D pose output. Our proposed Bingham Policy Parameterization (BPP) models the Bingham distribution and allows for better rotation (quaternion) prediction than a Gaussian policy parameterization in a range of reinforcement learning tasks. We evaluate BPP on the rotation Wahba problem task, as well as a set of vision-based next-best pose robot manipulation tasks from RLBench. We hope that…
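One reason a Bingham distribution suits quaternion outputs can be shown in a few lines: the unit quaternions q and -q encode the same 3D rotation, and the Bingham density is a quadratic form in q, so it assigns them equal density. A Gaussian over the four quaternion components does not. The sketch below uses the standard Bingham form, with log-density proportional to q^T M Z M^T q, where M is orthogonal and Z is diagonal with non-positive entries. The function names and the specific concentration values are assumptions for illustration, not the paper's network parameterization.

```python
import numpy as np


def bingham_log_density_unnormalized(q, M, z):
    """Unnormalized Bingham log-density of a quaternion q.

    M: 4x4 orthogonal matrix; z: 4 non-positive concentrations (last is 0).
    (Hypothetical helper using the standard Bingham form.)
    """
    q = q / np.linalg.norm(q)            # project onto the unit sphere
    proj = M.T @ q                       # coordinates of q in the basis M
    return float(np.sum(z * proj ** 2))  # equals q^T M diag(z) M^T q


def gaussian_log_density_unnormalized(q, mean, sigma):
    """Unnormalized isotropic Gaussian log-density over the 4 components."""
    return float(-0.5 * np.sum((q - mean) ** 2) / sigma ** 2)


rng = np.random.default_rng(0)
M, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # random orthogonal basis
z = np.array([-50.0, -20.0, -10.0, 0.0])           # assumed concentrations
mode = M[:, 3]                                     # column paired with z = 0

# A quaternion near the mode, and its antipode (the same rotation).
q = mode + 0.05 * rng.standard_normal(4)
q = q / np.linalg.norm(q)

b_pos = bingham_log_density_unnormalized(q, M, z)
b_neg = bingham_log_density_unnormalized(-q, M, z)
g_pos = gaussian_log_density_unnormalized(q, mode, sigma=0.1)
g_neg = gaussian_log_density_unnormalized(-q, mode, sigma=0.1)
b_mode = bingham_log_density_unnormalized(mode, M, z)
```

Here `b_pos` and `b_neg` agree, while `g_neg` is far lower than `g_pos` even though `q` and `-q` describe the same rotation. The normalizing constant of the Bingham distribution is omitted because it does not affect this comparison.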

Code & Models

Repositories

stepjam/BPP (PyTorch, official)

Taxonomy

Topics: Robotic Mechanisms and Dynamics · Reinforcement Learning in Robotics · Hereditary Neurological Disorders