Bootstrap Advantage Estimation for Policy Optimization in Reinforcement Learning

Md Masudur Rahman; Yexiang Xue

arXiv:2210.07312·cs.LG·October 17, 2022

TL;DR

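This paper introduces Bootstrap Advantage Estimation (BAE), an advantage estimation method for policy optimization in reinforcement learning. Instead of augmenting the inputs to the policy and value networks, BAE uses data augmentation to compute the advantage estimate itself, and the authors report improved sample efficiency and generalization across several environments.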

Contribution

The paper presents a bootstrap advantage estimation method that uses data augmentation to improve policy and value function updates, and reports that it outperforms GAE, RAD, and DRAC in the authors' experiments.

Findings

01. BAE reduces policy and value loss more effectively than GAE.

02. BAE improves cumulative return across multiple benchmarks.

03. BAE demonstrates superior sample efficiency and generalization in unseen environments.

Abstract

This paper proposes an advantage estimation approach based on data augmentation for policy optimization. Whereas existing methods apply data augmentation to the inputs used to learn the value and policy functions, our method uses data augmentation to compute a bootstrap advantage estimate. This Bootstrap Advantage Estimation (BAE) is then used to compute the gradients that update the policy and value functions. To demonstrate the effectiveness of our approach, we conducted experiments on several environments from three benchmarks: Procgen, DeepMind Control, and PyBullet. Together these cover image-based and vector-based observations as well as discrete and continuous action spaces. We observe that our method reduces the policy and value losses more than the Generalized Advantage Estimation (GAE) method and eventually improves cumulative return. Furthermore, our method performs…
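As a rough illustration of the idea in the abstract, the sketch below first computes standard GAE and then averages GAE over value estimates taken from augmented observations. The averaging scheme is only one plausible reading of "bootstrap advantage estimation" and is an assumption, not the authors' published algorithm. The names `gae`, `augmentation_averaged_advantage`, `value_fn`, and `augmentations` are hypothetical, and the trajectory is assumed to contain no episode terminations.

```python
def gae(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Standard Generalized Advantage Estimation over one trajectory.

    Assumes no terminal state inside the trajectory; `last_value` is the
    value estimate of the state that follows the final step.
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error at time t.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def augmentation_averaged_advantage(rewards, observations, last_observation,
                                    value_fn, augmentations,
                                    gamma=0.99, lam=0.95):
    """Hypothetical sketch: average GAE across augmented observations.

    ASSUMPTION: this averaging is an illustrative guess at how data
    augmentation could enter the advantage estimate. It is not taken
    from the paper.
    """
    totals = [0.0] * len(rewards)
    for augment in augmentations:
        # Value estimates computed on the augmented observations.
        values = [value_fn(augment(obs)) for obs in observations]
        last_value = value_fn(augment(last_observation))
        advantages = gae(rewards, values, last_value, gamma, lam)
        totals = [total + adv for total, adv in zip(totals, advantages)]
    return [total / len(augmentations) for total in totals]
```

Here the augmentation changes only the value estimates that feed the advantage, while the rewards stay as collected. This mirrors the abstract's contrast with methods that augment the inputs used to train the policy and value networks directly.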

Code & Models

Repositories

masud99r/bae
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification