Adaptive Variance for Changing Sparse-Reward Environments

Xingyu Lin; Pengsheng Guo; Carlos Florensa; David Held

arXiv:1903.06309·cs.RO·May 10, 2019·1 cites

Open Access

TL;DR

This paper introduces a method to adapt the exploration variance of policies in changing sparse-reward environments, improving robot adaptability without explicitly modeling environmental changes.

Contribution

It provides a theoretical framework linking value functions to exploration variance, enabling effective policy adaptation in dynamic environments.

Findings

01

The proposed variance adjustment strategy improves exploration in changing environments.

02

The method enables faster adaptation compared to fixed variance policies.

03

The approach is effective across various sparse-reward scenarios.

Abstract

Robots that are trained to perform a task in a fixed environment often fail when facing unexpected changes to the environment due to a lack of exploration. We propose a principled way to adapt the policy for better exploration in changing sparse-reward environments. Unlike previous work that explicitly models environmental changes, we analyze the relationship between the value function and the optimal exploration for a Gaussian-parameterized policy. We show that our theory leads to an effective strategy for adjusting the variance of the policy, enabling fast adaptation to changes in a variety of sparse-reward environments.
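To make the idea of value-dependent exploration concrete, here is a minimal sketch of a Gaussian policy whose standard deviation widens as the estimated value falls. It is not the rule derived in the paper: the linear interpolation, the function names `adapt_std` and `sample_action`, and the parameters `base_std`, `max_std`, and `reference_value` are all assumptions chosen for illustration.

```python
import random


def adapt_std(base_std, value_estimate, reference_value, max_std=2.0):
    """Hypothetical heuristic (not the paper's rule): widen exploration
    when the current value estimate falls below a reference value, e.g.
    the value observed before the environment changed."""
    if reference_value <= 0:
        # No meaningful reference to compare against; keep the base std.
        return base_std
    # Fraction of the reference value that has been lost, clamped to [0, 1].
    shortfall = min(1.0, max(0.0, 1.0 - value_estimate / reference_value))
    # Interpolate linearly between base_std and max_std.
    return base_std + (max_std - base_std) * shortfall


def sample_action(mean, std, rng):
    """Draw a scalar action from a Gaussian policy N(mean, std^2)."""
    return rng.gauss(mean, std)


rng = random.Random(0)
# Value unchanged: the policy keeps its narrow exploration.
std_stable = adapt_std(0.1, value_estimate=1.0, reference_value=1.0)
# Value collapsed after a change: exploration widens to max_std.
std_changed = adapt_std(0.1, value_estimate=0.0, reference_value=1.0)
action = sample_action(0.0, std_changed, rng)
```

In this sketch, a drop in the value estimate stands in for the sparse reward no longer being reached, which is the signal used to increase the variance without modeling the environmental change itself.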

Taxonomy

Topics: Reinforcement Learning in Robotics · Gaussian Processes and Bayesian Inference · Advanced Bandit Algorithms Research