Policy Augmentation: An Exploration Strategy for Faster Convergence of Deep Reinforcement Learning Algorithms

Arash Mahyari

arXiv:2102.05249 · cs.LG · February 11, 2021

TL;DR

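This paper introduces Policy Augmentation, an exploration strategy for deep reinforcement learning. It uses inductive matrix completion to estimate values for unexplored state-action pairs, which steers the agent toward high-value actions in early episodes and accelerates convergence. The authors report that it outperforms existing exploration strategies.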

Contribution

The paper presents a new exploration algorithm, Policy Augmentation, based on inductive matrix completion, enhancing exploration efficiency and convergence speed in deep reinforcement learning.

Findings

01

Policy Augmentation improves exploration in early episodes.

02

The method accelerates convergence of deep RL algorithms.

03

Experimental results show superior performance over existing strategies.

Abstract

Despite advancements in deep reinforcement learning algorithms, developing an effective exploration strategy is still an open problem. Most existing exploration strategies either are based on simple heuristics, or require the model of the environment, or train additional deep neural networks to generate imagination-augmented paths. In this paper, a revolutionary algorithm, called Policy Augmentation, is introduced. Policy Augmentation is based on a newly developed inductive matrix completion method. The proposed algorithm augments the values of unexplored state-action pairs, helping the agent take actions that will result in high-value returns while the agent is in the early episodes. Training deep reinforcement learning algorithms with high-value rollouts leads to the faster convergence of deep reinforcement learning algorithms. Our experiments show the superior performance of Policy…
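The abstract does not spell out the completion method, so the following is only a rough sketch of the general idea of filling in values for unexplored state-action pairs. It assumes a generic bilinear model, Q(s, a) ≈ x_s^T W y_a, fit by plain gradient descent on the explored entries; this is a stand-in technique, not the paper's algorithm. The feature vectors, table sizes, and the names predict, fit_bilinear, and augment are all hypothetical.

```python
# Illustrative sketch only: generic bilinear (inductive-style) completion of a
# small Q-table. NOT the paper's method; features and names are hypothetical.

def predict(W, x, y):
    """Bilinear score x^T W y for one state/action feature pair."""
    return sum(x[i] * W[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def fit_bilinear(observed, X, Y, lr=0.1, steps=500):
    """Fit W on explored {(s, a): q} entries by minimising 0.5 * squared error."""
    d1, d2 = len(X[0]), len(Y[0])
    W = [[0.0] * d2 for _ in range(d1)]
    for _ in range(steps):
        grad = [[0.0] * d2 for _ in range(d1)]
        for (s, a), q in observed.items():
            err = predict(W, X[s], Y[a]) - q
            for i in range(d1):
                for j in range(d2):
                    grad[i][j] += err * X[s][i] * Y[a][j]
        for i in range(d1):
            for j in range(d2):
                W[i][j] -= lr * grad[i][j]
    return W

def augment(observed, X, Y, W):
    """Keep explored values; fill unexplored pairs with the model's prediction."""
    return [[observed.get((s, a), predict(W, X[s], Y[a]))
             for a in range(len(Y))]
            for s in range(len(X))]

# Toy problem (assumed): 4 states, 3 actions, 2-dimensional features each.
X = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, -1.0]]   # state features
Y = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]                # action features
W_true = [[1.0, 2.0], [0.5, -1.0]]                      # generates the toy values

# Only 6 of the 12 state-action pairs have been explored so far.
explored = [(0, 0), (0, 1), (2, 0), (2, 1), (1, 2), (3, 0)]
observed = {(s, a): predict(W_true, X[s], Y[a]) for s, a in explored}

W_hat = fit_bilinear(observed, X, Y)
q_aug = augment(observed, X, Y, W_hat)

# Greedy choice in state 3 now uses completed values for its unexplored actions.
greedy_action = max(range(len(Y)), key=lambda a: q_aug[3][a])
```

In this toy setup, state 3 has only one explored action, yet the completed table lets a greedy policy rank all three. In a deep RL setting the table would presumably be replaced by values from the learned network, which this sketch does not attempt to model.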

Code & Models

Repositories

arashmahyari/PolicyAugmentation
none · Official
