Estimating Risk and Uncertainty in Deep Reinforcement Learning
William R. Clements, Bastien Van Delft, Benoît-Marie Robaglia, Reda Bahi Slaoui, Sébastien Toth

TL;DR
This paper presents a framework for disentangling and estimating epistemic and aleatoric uncertainties in deep reinforcement learning, enabling safer exploration and risk-sensitive decision-making.
Contribution
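The authors derive unbiased estimators of the epistemic and aleatoric uncertainty on learned Q-values, and introduce an uncertainty-aware DQN algorithm that exhibits safe learning behavior and outperforms other DQN variants on the MinAtar testbed.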
Findings
The proposed framework disentangles and estimates epistemic and aleatoric uncertainty on learned Q-values.
The uncertainty-aware DQN outperforms other DQN variants on the MinAtar testbed.
The uncertainty-aware DQN exhibits safe learning behavior.
Abstract
Reinforcement learning agents are faced with two types of uncertainty. Epistemic uncertainty stems from limited data and is useful for exploration, whereas aleatoric uncertainty arises from stochastic environments and must be accounted for in risk-sensitive applications. We highlight the challenges involved in simultaneously estimating both of them, and propose a framework for disentangling and estimating these uncertainties on learned Q-values. We derive unbiased estimators of these uncertainties and introduce an uncertainty-aware DQN algorithm, which we show exhibits safe learning behavior and outperforms other DQN variants on the MinAtar testbed.
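To make the distinction concrete, the sketch below splits the variance of a return estimate into the two parts using the law of total variance. It is only an illustration under stated assumptions, not the estimator derived in the paper: it assumes a hypothetical ensemble in which each member predicts a set of equally weighted return quantiles for a single state-action pair, and the function name `decompose_uncertainty` is made up for this example. The naive plug-in variances used here are generally biased for small ensembles, whereas the abstract states that the paper derives unbiased estimators.

```python
import numpy as np


def decompose_uncertainty(quantiles):
    """Split return variance into epistemic and aleatoric parts.

    quantiles: array-like of shape (n_members, n_quantiles). Row i holds
    the equally weighted return quantiles predicted by hypothetical
    ensemble member i for one state-action pair.
    """
    quantiles = np.asarray(quantiles, dtype=float)

    # Each member's Q-value estimate is the mean of its return quantiles.
    member_means = quantiles.mean(axis=1)
    # Each member's spread reflects the randomness it attributes to the environment.
    member_vars = quantiles.var(axis=1)

    # Aleatoric: average within-member variance.
    aleatoric = member_vars.mean()
    # Epistemic: disagreement between members about the Q-value.
    epistemic = member_means.var()
    return epistemic, aleatoric


if __name__ == "__main__":
    # Members agree on a wide return distribution: all uncertainty is aleatoric.
    print(decompose_uncertainty([[-1.0, 1.0], [-1.0, 1.0]]))
    # Members predict deterministic but different returns: all uncertainty is epistemic.
    print(decompose_uncertainty([[2.0, 2.0], [4.0, 4.0]]))
```

With this decomposition the two terms sum to the variance of the pooled quantiles, which is why disagreement between members (useful for exploration) can be reported separately from environment noise (relevant for risk-sensitive decisions).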
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Adversarial Robustness in Machine Learning
