Efficient and Robust Reinforcement Learning with Uncertainty-based Value Expansion

Bo Zhou; Hongsheng Zeng; Fan Wang; Yunxiang Li; Hao Tian

arXiv:1912.05328 · cs.LG · December 12, 2019 · 6 cites

TL;DR

This paper introduces RAVE, a hybrid reinforcement learning method that enhances robustness and performance by incorporating uncertainty modeling and risk aversion into value expansion techniques.

Contribution

The paper proposes RAVE, a novel hybrid RL algorithm that integrates probabilistic dynamics models and risk aversion to improve robustness and sample efficiency in stochastic environments.

Findings

01. RAVE outperforms previous methods in challenging environments.

02. Modeling uncertainty improves robustness of RL algorithms.

03. A RAVE-based solution took first place in the NeurIPS 2019: Learn to Move challenge.

Abstract

By integrating dynamics models into model-free reinforcement learning (RL) methods, model-based value expansion (MVE) algorithms have shown a significant advantage in sample efficiency as well as value estimation. However, these methods suffer from higher function approximation errors than model-free methods in stochastic environments due to a lack of modeling the environmental randomness. As a result, their performance lags behind the best model-free algorithms in some challenging scenarios. In this paper, we propose a novel Hybrid-RL method that builds on MVE, namely the Risk Averse Value Expansion (RAVE). With imaginative rollouts generated by an ensemble of probabilistic dynamics models, we further introduce the aversion of risks by seeking the lower confidence bound of the estimation. Experiments on a range of challenging environments show that by modeling the uncertainty…
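The lower-confidence-bound idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy dynamics model, the reward, the helper names (`make_toy_model`, `expansion_target`, `risk_averse_target`), and the fixed pessimism weight `alpha` are all assumptions made for illustration. Each ensemble member produces one imagined H-step rollout, each rollout yields a value expansion target, and the final target is the ensemble mean minus `alpha` times the ensemble standard deviation.

```python
import random
import statistics


def make_toy_model(drift, noise_std):
    """Hypothetical probabilistic dynamics model (illustrative, not from the paper).

    Returns a function that samples (next_state, reward) for a 1-D state.
    """
    def step(state, action, rng):
        next_state = state + action + drift + rng.gauss(0.0, noise_std)
        reward = -abs(next_state)  # toy reward: stay close to the origin
        return next_state, reward
    return step


def expansion_target(model, policy, value_fn, state, horizon, gamma, rng):
    """H-step value expansion target from a single imagined rollout."""
    target, discount = 0.0, 1.0
    for _ in range(horizon):
        action = policy(state)
        state, reward = model(state, action, rng)
        target += discount * reward
        discount *= gamma
    # Bootstrap from the learned value function at the final imagined state.
    return target + discount * value_fn(state)


def lower_confidence_bound(targets, alpha):
    """Mean of the targets minus alpha times their population std deviation."""
    return statistics.fmean(targets) - alpha * statistics.pstdev(targets)


def risk_averse_target(models, policy, value_fn, state,
                       horizon=3, gamma=0.99, alpha=1.0, seed=0):
    """Risk-averse target: LCB over one rollout per ensemble member."""
    rng = random.Random(seed)
    targets = [
        expansion_target(m, policy, value_fn, state, horizon, gamma, rng)
        for m in models
    ]
    return lower_confidence_bound(targets, alpha), targets


if __name__ == "__main__":
    # Assumed ensemble: members disagree slightly about the drift.
    ensemble = [make_toy_model(d, noise_std=0.1) for d in (-0.05, 0.0, 0.05)]
    lcb, per_model = risk_averse_target(
        ensemble,
        policy=lambda s: -0.5 * s,   # hypothetical policy
        value_fn=lambda s: -abs(s),  # hypothetical value estimate
        state=1.0,
    )
    print("per-model targets:", per_model)
    print("risk-averse target:", lcb)
```

With `alpha=0` the target reduces to the plain ensemble mean; larger values of `alpha` make the estimate more pessimistic wherever the ensemble members disagree. How the paper estimates the variance and chooses the confidence weight is not stated in this excerpt, so those details are left as assumptions here.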

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Multi-Objective Optimization Algorithms · Autonomous Vehicle Technology and Safety