Risk-Averse Planning Under Uncertainty
Mohamadreza Ahmadi, Masahiro Ono, Michel D. Ingham, Richard M. Murray, and Aaron D. Ames

TL;DR
This paper introduces a method for designing risk-averse policies in POMDPs using bounded policy iteration and convex optimization, enabling finite-memory solutions for complex decision-making under uncertainty.
Contribution
It proposes a novel bounded policy iteration approach for risk-averse POMDPs that produces finite-memory controllers, addressing the undecidability of infinite-memory policy synthesis.
Findings
Effective finite-memory controllers for risk-averse POMDPs
Sub-optimal solutions with reduced coherent risk
Utilizes convex optimization for policy design
Abstract
We consider the problem of designing policies for partially observable Markov decision processes (POMDPs) with dynamic coherent risk objectives. Synthesizing risk-averse optimal policies for POMDPs requires infinite memory and is thus undecidable. To overcome this difficulty, we propose a method based on bounded policy iteration for designing stochastic but finite-state (memory) controllers, which takes advantage of standard convex optimization methods. Given a memory budget and optimality criterion, the proposed method modifies the stochastic finite-state controller, leading to sub-optimal solutions with lower coherent risk.
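To make "coherent risk" concrete, here is a minimal sketch of one standard coherent risk measure, conditional value-at-risk (CVaR), evaluated on a discrete cost distribution. The abstract does not say which coherent risk measure the paper uses, so the choice of CVaR, the function name `cvar`, and the toy numbers are assumptions for illustration only. The sketch covers a single static risk evaluation, not the dynamic (nested) objective or the bounded policy iteration itself.

```python
def cvar(costs, probs, alpha):
    """CVaR at confidence level alpha (0 <= alpha < 1) of a discrete cost distribution.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if len(costs) != len(probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must match costs in length and sum to 1")

    def objective(z):
        # Rockafellar-Uryasev objective: z + E[(X - z)^+] / (1 - alpha)
        excess = sum(p * max(c - z, 0.0) for c, p in zip(costs, probs))
        return z + excess / (1.0 - alpha)

    # The objective is convex and piecewise linear in z with breakpoints at
    # the cost values, so its minimum is attained at one of those values.
    return min(objective(z) for z in costs)


# Toy example (assumed numbers): a cost of 10 occurs with probability 0.1.
costs = [0.0, 10.0]
probs = [0.9, 0.1]
risk_neutral = cvar(costs, probs, 0.0)  # alpha = 0 recovers the expected cost
risk_averse = cvar(costs, probs, 0.9)   # averages over the worst 10% of outcomes
```

Setting `alpha = 0` gives the ordinary expected cost, while larger values of `alpha` weight the high-cost tail more heavily. This is the sense in which a risk-averse controller can accept a worse average in exchange for lower risk.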
