RAPTOR: End-to-end Risk-Aware MDP Planning and Policy Learning by Backpropagation

Noah Patton; Jihwan Jeong; Michael Gimelfarb; Scott Sanner

arXiv:2106.07260 · cs.LG · June 15, 2021

TL;DR

RAPTOR introduces a risk-aware planning framework that optimizes the entropic utility in stochastic MDPs using backpropagation, enabling risk-sensitive decision-making in complex environments.

Contribution

It presents a novel reparameterization technique allowing end-to-end risk-aware optimization via backpropagation in stochastic environments.
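For orientation, the entropic utility and the standard reparameterized (pathwise) gradient can be written as follows. This is a sketch of the textbook identities only, not the paper's specific construction: the symbols Z (return), β (risk parameter), θ (plan or policy parameters), ξ (noise), and g (a differentiable rollout map) are notation assumed here for illustration.

```latex
% Entropic utility of a random return Z with risk parameter \beta \neq 0,
% and its second-order expansion around \beta = 0:
U_\beta(Z) = \frac{1}{\beta} \log \mathbb{E}\left[ e^{\beta Z} \right]
           = \mathbb{E}[Z] + \frac{\beta}{2}\,\mathrm{Var}[Z] + O(\beta^2)

% If the return is a differentiable function of parameters and
% parameter-independent noise, Z = g(\theta, \xi) with \xi \sim p(\xi),
% the gradient passes through the expectation:
\nabla_\theta U_\beta
  = \frac{\mathbb{E}_{\xi}\left[ e^{\beta g(\theta,\xi)} \, \nabla_\theta g(\theta,\xi) \right]}
         {\mathbb{E}_{\xi}\left[ e^{\beta g(\theta,\xi)} \right]}
```

For returns, β < 0 penalizes variance (risk-averse), β > 0 rewards it (risk-seeking), and β → 0 recovers the expected return.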

Findings

01. Successfully applied to navigation, HVAC, and reservoir control domains.

02. Demonstrates effective risk management in highly stochastic MDPs.

03. Outperforms risk-agnostic methods in complex scenarios.

Abstract

Planning provides a framework for optimizing sequential decisions in complex environments. Recent advances in efficient planning in deterministic or stochastic high-dimensional domains with continuous action spaces leverage backpropagation through a model of the environment to directly optimize actions. However, existing methods typically do not take risk into account when optimizing in stochastic domains, even though risk can be incorporated efficiently in MDPs by optimizing the entropic utility of returns. We bridge this gap by introducing Risk-Aware Planning using PyTorch (RAPTOR), a novel framework for risk-sensitive planning through end-to-end optimization of the entropic utility objective. A key technical difficulty of our approach is that direct optimization of the entropic utility by backpropagation is impossible due to the presence of environment stochasticity. The novelty of RAPTOR…
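As a rough illustration of optimizing an entropic utility by backpropagation, the sketch below plans a single scalar action in a toy one-step problem. It is not the RAPTOR implementation: the return model `sample_returns`, the function names, and all hyperparameters are assumptions made up for this example. Noise is drawn independently of the action and then transformed, so gradients flow from the Monte Carlo utility estimate back to the action.

```python
import math
import torch

def entropic_utility(returns: torch.Tensor, beta: float) -> torch.Tensor:
    """Monte Carlo estimate of (1/beta) * log E[exp(beta * Z)], beta != 0."""
    n = returns.numel()
    # logsumexp keeps the estimate numerically stable for large |beta * Z|.
    return (torch.logsumexp(beta * returns, dim=0) - math.log(n)) / beta

def sample_returns(action: torch.Tensor, noise: torch.Tensor) -> torch.Tensor:
    """Hypothetical toy model: mean return peaks at action = 1, while the
    noise scale grows with the action (larger actions are riskier)."""
    return action - 0.5 * action**2 + action * noise

def plan(beta: float, steps: int = 300, n_samples: int = 4096,
         lr: float = 0.1, seed: int = 0) -> float:
    """Gradient ascent on the estimated entropic utility of the toy return."""
    gen = torch.Generator().manual_seed(seed)
    action = torch.zeros((), requires_grad=True)
    opt = torch.optim.SGD([action], lr=lr)
    for _ in range(steps):
        # Noise does not depend on the action, so the sampled returns are
        # a differentiable function of it.
        noise = torch.randn(n_samples, generator=gen)
        loss = -entropic_utility(sample_returns(action, noise), beta)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return action.item()

if __name__ == "__main__":
    print("risk-averse action (beta = -1):", plan(beta=-1.0))
    print("near risk-neutral action (beta = -0.01):", plan(beta=-0.01))
```

Because the toy return is Gaussian, its entropic utility has the closed form a - (1 - β)a²/2, which is maximized at a = 1/(1 - β). The risk-averse setting β = -1 should therefore settle near 0.5, and a β close to zero near the risk-neutral optimum of 1, up to sampling noise.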

Taxonomy

Topics: Reservoir Engineering and Simulation Methods · Water resources management and optimization · Machine Learning and Algorithms