One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning
Marc Rigter, Bruno Lacerda, Nick Hawes

TL;DR
This paper introduces a risk-sensitive model-based offline reinforcement learning approach that accounts for both epistemic and aleatoric uncertainties, improving safety and performance in stochastic environments.
Contribution
It proposes a unified risk-sensitive framework that jointly addresses distributional shift and stochasticity in offline RL, using risk-averse modeling of uncertainties.
Findings
Achieves competitive results on deterministic benchmarks.
Outperforms existing risk-sensitive methods in stochastic domains.
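Risk-aversion to epistemic uncertainty prevents distributional shift, while risk-aversion to aleatoric uncertainty discourages actions that may lead to poor outcomes.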
Abstract
Offline reinforcement learning (RL) is suitable for safety-critical domains where online exploration is too costly or dangerous. In such safety-critical settings, decision-making should take into consideration the risk of catastrophic outcomes. In other words, decision-making should be risk-sensitive. Previous works on risk in offline RL combine offline RL techniques, to avoid distributional shift, with risk-sensitive RL algorithms, to achieve risk-sensitivity. In this work, we propose risk-sensitivity as a mechanism to jointly address both of these issues. Our model-based approach is risk-averse to both epistemic and aleatoric uncertainty. Risk-aversion to epistemic uncertainty prevents distributional shift, as areas not covered by the dataset have high epistemic uncertainty. Risk-aversion to aleatoric uncertainty discourages actions that may result in poor outcomes due to…
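The idea of a single risk measure covering both kinds of uncertainty can be illustrated with a small sketch. It is not the authors' implementation: it assumes conditional value at risk (CVaR) as the risk measure, which the text above does not specify, and the ensemble values, action names, and the `risk_averse_value` helper are all hypothetical. Each action is scored by the mean of its worst outcomes, pooled across the samples of every ensemble member, so spread within a member (aleatoric) and disagreement between members (epistemic) are both penalized.

```python
import math


def cvar(values, alpha):
    """Mean of the worst ceil(alpha * n) values (lower tail, for rewards)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(values)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


def mean(values):
    return sum(values) / len(values)


def pool(ensemble_samples):
    """Flatten per-member samples into one list of outcomes."""
    return [v for member in ensemble_samples for v in member]


def risk_averse_value(ensemble_samples, alpha=0.5):
    """Hypothetical scoring rule: CVaR over outcomes pooled across the ensemble."""
    return cvar(pool(ensemble_samples), alpha)


# Made-up value samples: one inner list per ensemble member.
actions = {
    # Well covered by data, nearly deterministic.
    "safe": [[1.0, 1.1, 0.9], [1.0, 1.0, 1.1]],
    # Members agree, but outcomes are occasionally bad (aleatoric).
    "risky": [[3.0, -1.0, 3.2], [2.8, 3.1, -0.9]],
    # Members disagree strongly, as for an out-of-dataset action (epistemic).
    "unseen": [[5.0, 5.1, 4.9], [-2.0, -2.1, -1.9]],
}

best_by_mean = max(actions, key=lambda a: mean(pool(actions[a])))
best_by_cvar = max(actions, key=lambda a: risk_averse_value(actions[a]))

print("risk-neutral choice:", best_by_mean)
print("risk-averse choice:", best_by_cvar)
```

In this toy data the risk-neutral criterion prefers one of the high-variance actions, while the pooled CVaR criterion prefers the well-covered one. Pooling samples is only one simple way to combine the two uncertainty sources and is chosen here for brevity.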
Taxonomy
Topics: Reinforcement Learning in Robotics · Mobile Crowdsensing and Crowdsourcing · Multimodal Machine Learning Applications
