DRL-ORA: Distributional Reinforcement Learning with Online Risk Adaption
Yupeng Wu, Wenyun Li, Wenjie Huang, Chin Pang Ho

TL;DR
DRL-ORA is a distributional reinforcement learning framework that adjusts the epistemic risk level online during learning. It quantifies epistemic and aleatory uncertainty in a unified manner, with the aim of reaching reliable policies in safety-critical settings more efficiently.
Contribution
It proposes a unified framework for online risk adaptation in distributional RL, combining uncertainty quantification with dynamic risk level adjustment.
Findings
Outperforms fixed risk level methods in various tasks.
Provides better explainability and flexibility in risk management.
Selects risk levels efficiently via grid search and online optimization.
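To make "fixed risk level" concrete, here is a minimal sketch of how a distributional agent might pick actions under a single, constant risk level. It assumes each action's return distribution is represented by equally weighted quantile estimates and uses CVaR as the risk measure. Both choices are illustrative assumptions, not details confirmed by this summary, and the names `cvar_from_quantiles` and `risk_sensitive_action` are hypothetical.

```python
import numpy as np


def cvar_from_quantiles(quantiles, alpha):
    """Approximate CVaR at level alpha from equally weighted quantile estimates.

    Averages the lowest ceil(alpha * N) estimates. alpha = 1.0 recovers the
    plain mean (risk-neutral); smaller alpha is more risk-averse.
    """
    q = np.sort(np.asarray(quantiles, dtype=float))
    k = max(1, int(np.ceil(alpha * q.size)))
    return float(q[:k].mean())


def risk_sensitive_action(quantiles_per_action, alpha):
    """Return the index of the action with the highest CVaR at level alpha."""
    scores = [cvar_from_quantiles(q, alpha) for q in quantiles_per_action]
    return int(np.argmax(scores))


# Hypothetical quantile estimates for two actions (illustrative numbers only).
risky = [-10.0, 0.0, 5.0, 9.0]   # higher mean, heavy lower tail
safe = [0.0, 0.5, 1.0, 1.5]      # lower mean, no bad outcomes

neutral_choice = risk_sensitive_action([risky, safe], alpha=1.0)
averse_choice = risk_sensitive_action([risky, safe], alpha=0.25)
```

With these numbers the risk-neutral setting prefers the first action and the risk-averse setting prefers the second. A fixed-risk-level method commits to one `alpha` for the whole of training, which is the baseline the findings above compare against.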
Abstract
One of the main challenges in reinforcement learning (RL) is that the agent has to make decisions that would influence the future performance without having complete knowledge of the environment. Dynamically adjusting the level of epistemic risk during the learning process can help to achieve reliable policies in safety-critical settings with better efficiency. In this work, we propose a new framework, Distributional RL with Online Risk Adaptation (DRL-ORA). This framework quantifies both epistemic and implicit aleatory uncertainties in a unified manner and dynamically adjusts the epistemic risk levels by solving a total variation minimization problem online. The framework unifies the existing variants of risk adaption approaches and offers better explainability and flexibility. The selection of risk levels is performed efficiently via a grid search using a Follow-The-Leader-type…
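The abstract mentions a grid search combined with a Follow-The-Leader-type procedure. As a rough illustration of that general idea only, the sketch below runs plain Follow-The-Leader over a finite grid of candidate risk levels: at each round it plays the candidate with the smallest cumulative loss so far. The class name `FollowTheLeaderGrid` and the per-round losses are hypothetical placeholders. The paper derives its objective from a total variation minimization problem, which is not reproduced here.

```python
import numpy as np


class FollowTheLeaderGrid:
    """Follow-The-Leader over a fixed grid of candidate risk levels."""

    def __init__(self, grid):
        self.grid = np.asarray(grid, dtype=float)
        # Cumulative loss observed so far for each candidate on the grid.
        self.cum_loss = np.zeros(self.grid.size)

    def select(self):
        """Return the candidate with the lowest cumulative loss (ties: first)."""
        return float(self.grid[int(np.argmin(self.cum_loss))])

    def update(self, losses):
        """Add this round's loss for every candidate (full-information feedback)."""
        losses = np.asarray(losses, dtype=float)
        if losses.shape != self.cum_loss.shape:
            raise ValueError("need exactly one loss per grid point")
        self.cum_loss += losses


# Toy run: the per-round losses below are made up for illustration.
ftl = FollowTheLeaderGrid(grid=[0.1, 0.5, 0.9])
history = [ftl.select()]                 # before any feedback
for round_losses in ([0.4, 0.0, 0.4], [0.0, 0.4, 0.8], [0.0, 0.4, 0.8]):
    ftl.update(round_losses)
    history.append(ftl.select())
```

Because the grid is finite, each selection is a single `argmin` over the accumulated losses, which is one reason a grid search of this kind is cheap to run online.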
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Mobile Crowdsensing and Crowdsourcing
