Contributions on complexity bounds for Deterministic Partially Observed Markov Decision Process

Cyrille Vessaire (CERMICS); Jean-Philippe Chancelier (CERMICS); Michel de Lara (CERMICS); Pierre Carpentier (OC); Alejandro Rodríguez-Martínez (Total Energies, formerly Total, TotalFina, TotalFinaElf)

arXiv:2301.08567 · math.OC · September 24, 2025

TL;DR

This paper studies complexity bounds for deterministic subclasses of partially observed Markov decision processes. It improves existing bounds and introduces simpler subclasses that mitigate the curse of dimensionality.

Contribution

It improves existing complexity bounds for Det-Pomdp and introduces Separated Det-Pomdp, a simpler subclass with better computational properties.

Findings

01 Improved complexity bounds for Det-Pomdp.

02 Introduction of Separated Det-Pomdp with reduced complexity.

03 Analysis of how deterministic structures mitigate the curse of dimensionality.

Abstract

Markov Decision Processes (Mdps) form a versatile framework used to model a wide range of optimization problems. The Mdp model consists of sets of states, actions, time steps, rewards, and probability transitions. When in a given state and at a given time, the decision maker's action generates a reward and determines the state at the next time step according to the probability transition function. However, Mdps assume that the decision maker knows the state of the controlled dynamical system. Hence, when one needs to optimize controlled dynamical systems under partial observation, one often turns toward the formalism of Partially Observed Markov Decision Processes (Pomdps). Pomdps are often intractable in the general case, as Dynamic Programming suffers from the curse of dimensionality. Instead of focusing on general Pomdps, we present a subclass where transitions and observations…
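The abstract is truncated here, so the exact definition of the subclass is not shown. As a minimal sketch, assume (as the name Det-Pomdp suggests) that both the transition map and the observation map are deterministic functions. Under that assumption, a belief update only relabels and filters the states in the current support, so the support can never grow. All names below (`det_belief_update`, `transition`, `observe`) and the four-state toy system are hypothetical illustrations, not taken from the paper.

```python
from typing import Callable, Dict, Hashable

State = Hashable
Belief = Dict[State, float]  # maps each state in the support to its probability


def det_belief_update(
    belief: Belief,
    action: int,
    observation: int,
    transition: Callable[[State, int], State],  # assumed deterministic dynamics
    observe: Callable[[State, int], int],       # assumed deterministic observation
) -> Belief:
    """Push the belief through the dynamics, then condition on the observation."""
    pushed: Belief = {}
    for state, prob in belief.items():
        nxt = transition(state, action)
        # Keep only successor states that are consistent with what was observed.
        if observe(nxt, action) == observation:
            pushed[nxt] = pushed.get(nxt, 0.0) + prob
    total = sum(pushed.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in pushed.items()}


# Hypothetical toy system: four states on a cycle, the observation is the parity.
def transition(state: int, action: int) -> int:
    return (state + action) % 4


def observe(state: int, action: int) -> int:
    return state % 2


belief = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
new_belief = det_belief_update(belief, action=1, observation=0,
                               transition=transition, observe=observe)
```

In this toy example the support shrinks from four states to two (`0` and `2`), each with probability 0.5. Each state in the old support contributes at most one state to the new one, which is the structural property that makes deterministic subclasses easier to analyze than general Pomdps.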

Taxonomy

Topics: Reinforcement Learning in Robotics · Bayesian Modeling and Causal Inference · Formal Methods in Verification