Direct Random Search for Fine Tuning of Deep Reinforcement Learning Policies

Sean Gillen; Asutay Ozmen; Katie Byl

arXiv:2109.05604·cs.RO·March 8, 2022

TL;DR

This paper demonstrates that direct random search effectively fine-tunes deterministic policies in deep reinforcement learning, resulting in more consistent and higher-performing agents across various environments.

Contribution

The paper shows that direct random search, run on deterministic rollouts, is a simple and effective way to fine-tune DRL policies, improving both performance and consistency.

Findings

1. More consistent agent performance across environments.

2. Higher average rewards compared to baseline policies.

3. Effective extension to state space reduction techniques.

Abstract

Researchers have demonstrated that Deep Reinforcement Learning (DRL) is a powerful tool for finding policies that perform well on complex robotic systems. However, these policies are often unpredictable and can induce highly variable behavior when evaluated with only slightly different initial conditions. Training considerations constrain DRL algorithm designs in that most algorithms must use stochastic policies during training. The resulting policy used during deployment, however, can be, and frequently is, a deterministic one that uses the Maximum Likelihood Action (MLA) at each step. In this work, we show that a direct random search is very effective at fine-tuning DRL policies by directly optimizing them using deterministic rollouts. We illustrate this across a large collection of reinforcement learning environments, using a wide variety of policies obtained from different algorithms.…
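
The fine-tuning idea described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it uses a basic accept-if-better random search (the paper may use a different variant), and the toy double-integrator dynamics, the linear policy, and the names `rollout`, `evaluate`, and `random_search_finetune` are all assumptions made for this example. The deterministic linear policy stands in for the MLA of a trained stochastic policy (for a Gaussian policy, the MLA is its mean).

```python
import numpy as np


def rollout(theta, x0, steps=50, dt=0.1):
    """Deterministic rollout of the linear policy u = -theta @ x.

    The system is a toy double integrator (hypothetical stand-in for a
    real RL environment). Returns the total reward, a negative quadratic cost.
    """
    x = np.array(x0, dtype=float)
    total = 0.0
    for _ in range(steps):
        u = float(np.clip(-theta @ x, -5.0, 5.0))  # deterministic action
        x = x + dt * np.array([x[1], u])           # position, velocity update
        total -= float(x @ x) + 0.01 * u * u
    return total


def evaluate(theta, starts):
    """Average deterministic return over a fixed set of initial conditions."""
    return float(np.mean([rollout(theta, x0) for x0 in starts]))


def random_search_finetune(theta0, starts, iters=200, sigma=0.05, seed=0):
    """Accept-if-better random search over policy parameters.

    theta0 plays the role of the pretrained policy; each candidate is a
    Gaussian perturbation of the best parameters found so far.
    """
    rng = np.random.default_rng(seed)
    best = np.array(theta0, dtype=float)
    best_ret = evaluate(best, starts)
    for _ in range(iters):
        cand = best + sigma * rng.standard_normal(best.shape)
        ret = evaluate(cand, starts)
        if ret > best_ret:  # rollouts are deterministic, so this is noise-free
            best, best_ret = cand, ret
    return best, best_ret
```

Because every rollout is deterministic, the comparison `ret > best_ret` is not corrupted by sampling noise, so the returned parameters never score worse than `theta0` on the chosen initial conditions. Averaging over several initial conditions in `evaluate` is one simple way to target consistency rather than performance from a single start state.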

Code & Models

Repositories

sgillen/policy_refinement (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Adversarial Robustness in Machine Learning

Methods: Random Search