Convex Is Back: Solving Belief MDPs With Convexity-Informed Deep Reinforcement Learning

Daniel Koutas; Daniel Hettegger; Kostas G. Papakonstantinou; Daniel Straub

arXiv:2502.09298·cs.LG·March 13, 2025

TL;DR

This paper introduces a convexity-aware deep reinforcement learning (DRL) method for POMDPs that exploits the convexity of the value function over the belief space. On the Tiger and FieldVisionRockSample benchmarks, it improves agent performance and robustness to hyperparameter choices compared with standard DRL.

Contribution

It proposes a convexity-informed DRL approach for belief MDPs in two variants, hard-enforced and soft-enforced convexity, and reports improved performance and robustness over standard DRL.

Findings

01 Convexity-aware DRL can substantially outperform standard DRL on the Tiger and FieldVisionRockSample POMDP benchmarks.

02 Including convexity increases robustness across the hyperparameter space.

03 The gains are most pronounced when testing on out-of-distribution domains.

Abstract

We present a novel method for Deep Reinforcement Learning (DRL), incorporating the convex property of the value function over the belief space in Partially Observable Markov Decision Processes (POMDPs). We introduce hard- and soft-enforced convexity as two different approaches, and compare their performance against standard DRL on two well-known POMDP environments, namely the Tiger and FieldVisionRockSample problems. Our findings show that including the convexity feature can substantially increase performance of the agents, as well as increase robustness over the hyperparameter space, especially when testing on out-of-distribution domains. The source code for this work can be found at https://github.com/Dakout/Convex_DRL.
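To make the convexity property concrete, here is a minimal sketch in plain Python. It is not the authors' implementation (see the linked repository for that). It relies on the standard result that a maximum over linear functions of the belief, V(b) = max_i ⟨α_i, b⟩, is convex, and it measures violations of the defining inequality V(λb₁ + (1−λ)b₂) ≤ λV(b₁) + (1−λ)V(b₂). Reading "hard-enforced" as "convex by construction" and "soft-enforced" as "penalize measured violations" is an assumption made here for illustration. The names `pwlc_value`, `convexity_violation`, `mean_violation`, and the numbers in `ALPHAS` are hypothetical.

```python
import random


def pwlc_value(belief, alphas):
    """V(b) = max_i <alpha_i, b>: a max of linear functions, convex by construction."""
    return max(sum(a * p for a, p in zip(alpha, belief)) for alpha in alphas)


def convexity_violation(value_fn, b1, b2, lam):
    """Amount by which value_fn breaks
    V(lam*b1 + (1-lam)*b2) <= lam*V(b1) + (1-lam)*V(b2); zero if it holds."""
    mix = [lam * p + (1.0 - lam) * q for p, q in zip(b1, b2)]
    gap = value_fn(mix) - (lam * value_fn(b1) + (1.0 - lam) * value_fn(b2))
    return max(0.0, gap)


def random_belief(n, rng):
    """Sample a point on the probability simplex by normalizing positive weights."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical alpha vectors for a two-state problem (illustrative numbers only).
ALPHAS = [[10.0, -100.0], [-100.0, 10.0], [-1.0, -1.0]]


def convex_v(belief):
    """Convex by construction (assumed stand-in for a 'hard' constraint)."""
    return pwlc_value(belief, ALPHAS)


def concave_v(belief):
    """Deliberately non-convex function, used to show a nonzero penalty."""
    return -convex_v(belief)


def mean_violation(value_fn, n_pairs=200, seed=0):
    """Average violation over random belief pairs; could serve as a soft penalty term."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        b1, b2 = random_belief(2, rng), random_belief(2, rng)
        total += convexity_violation(value_fn, b1, b2, rng.random())
    return total / n_pairs


if __name__ == "__main__":
    print("convex V  :", mean_violation(convex_v))
    print("concave V :", mean_violation(concave_v))
```

In this sketch the convex function yields a penalty of zero up to floating-point rounding, while the negated function yields a strictly positive one. In a DRL setting, a penalty of this kind could in principle be added to the training loss of a value network, whereas a by-construction approach would restrict the network architecture instead. Both remarks are illustrative assumptions rather than a description of the paper's method.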

Code & Models

Repositories

dakout/convex_drl
PyTorch · Official

Taxonomy

Topics: Data Stream Mining Techniques · Reinforcement Learning in Robotics · Machine Learning and Algorithms