DRL-ORA: Distributional Reinforcement Learning with Online Risk Adaption

Yupeng Wu; Wenyun Li; Wenjie Huang; Chin Pang Ho

arXiv:2310.05179 · cs.LG · March 2, 2026 · 1 citation

Open Access

TL;DR

DRL-ORA is a distributional reinforcement learning framework that adjusts epistemic risk levels online during learning. It quantifies epistemic and aleatory uncertainty in a unified manner, aiming for reliable policies in safety-critical settings with better efficiency.

Contribution

The paper proposes a unified framework for online risk adaptation in distributional RL that combines uncertainty quantification with dynamic adjustment of the epistemic risk level.

Findings

01. Outperforms fixed-risk-level methods on various tasks.

02. Offers better explainability and flexibility in risk management while unifying existing risk adaptation variants.

03. Selects risk levels efficiently via a grid search driven by a Follow-The-Leader-type online algorithm.

Abstract

One of the main challenges in reinforcement learning (RL) is that the agent has to make decisions that would influence the future performance without having complete knowledge of the environment. Dynamically adjusting the level of epistemic risk during the learning process can help to achieve reliable policies in safety-critical settings with better efficiency. In this work, we propose a new framework, Distributional RL with Online Risk Adaptation (DRL-ORA). This framework quantifies both epistemic and implicit aleatory uncertainties in a unified manner and dynamically adjusts the epistemic risk levels by solving a total variation minimization problem online. The framework unifies the existing variants of risk adaption approaches and offers better explainability and flexibility. The selection of risk levels is performed efficiently via a grid search using a Follow-The-Leader-type…
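
The Follow-The-Leader-type grid search mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated and does not specify the loss being minimized, so the per-round losses below are hypothetical numbers supplied by the caller, and the names `follow_the_leader`, `grid`, and `loss_rounds` are illustrative assumptions. The sketch only shows the generic rule: at each round, pick the candidate risk level whose cumulative past loss is smallest.

```python
def follow_the_leader(grid, loss_rounds):
    """Generic Follow-The-Leader over a finite grid of candidate risk levels.

    grid        -- list of candidate risk levels (hypothetical values)
    loss_rounds -- iterable of per-round loss lists, one loss per grid point
                   (hypothetical; the paper's actual loss is not given here)
    Returns the risk level chosen at the start of each round.
    """
    cumulative = [0.0] * len(grid)
    choices = []
    for losses in loss_rounds:
        # The leader is the grid point with the smallest cumulative loss so far;
        # ties go to the earliest grid point.
        leader = min(range(len(grid)), key=lambda i: cumulative[i])
        choices.append(grid[leader])
        # Only after choosing do we observe this round's losses.
        for i, loss in enumerate(losses):
            cumulative[i] += loss
    return choices


# Illustrative usage with made-up losses for three candidate risk levels.
grid = [0.1, 0.5, 0.9]
loss_rounds = [
    [1.0, 0.0, 2.0],
    [1.0, 0.0, 2.0],
    [0.0, 3.0, 3.0],
]
choices = follow_the_leader(grid, loss_rounds)
```

Restricting the search to a finite grid keeps each selection step a simple minimum over a handful of running sums, which is one plausible reading of why the abstract describes the selection as efficient.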

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Mobile Crowdsensing and Crowdsourcing