Dynamic Programming for Structured Continuous Markov Decision Problems

Zhengzhu Feng; Richard Dearden; Nicolas Meuleau; Richard Washington

arXiv:1207.4115·cs.AI·July 19, 2012

TL;DR

This paper introduces a dynamic programming approach for continuous-state Markov Decision Processes that exploits problem structure by partitioning the state space into regions of identical value, so that optimal solutions to complex, structured problems can be computed efficiently.

Contribution

The paper presents a dynamic programming method for structured continuous MDPs that represents the value function as piecewise constant or piecewise linear, borrowing techniques from POMDPs to represent and reason about the linear surfaces.
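To make the piecewise linear idea concrete, here is a minimal sketch of one POMDP-style operation: treating the value inside a region as the maximum over a set of linear functions and discarding lines that can never matter. It is an illustration under stated assumptions, not the paper's procedure: the one-dimensional interval, the `(slope, intercept)` encoding, and the names `value` and `prune_dominated` are hypothetical, and only simple pointwise dominance is checked rather than a full linear-programming prune.

```python
def value(lines, x):
    """Upper surface at x: the max over linear pieces given as (slope, intercept)."""
    return max(a * x + b for a, b in lines)


def prune_dominated(lines, lo, hi):
    """Drop each line that another single line matches or beats on all of [lo, hi].

    The difference of two lines is itself linear, so comparing the two
    endpoints is enough to decide dominance over the whole interval.
    """
    kept = []
    for i, (a, b) in enumerate(lines):
        dominated = False
        for j, (c, d) in enumerate(lines):
            if i == j:
                continue
            ge_lo = c * lo + d >= a * lo + b
            ge_hi = c * hi + d >= a * hi + b
            strict = (c * lo + d > a * lo + b) or (c * hi + d > a * hi + b)
            # Among exact duplicates, keep only the earliest copy.
            if ge_lo and ge_hi and (strict or j < i):
                dominated = True
                break
        if not dominated:
            kept.append((a, b))
    return kept


# Hypothetical set of linear pieces over the region [0, 1].
lines = [(1.0, 0.0), (-1.0, 1.0), (0.0, -1.0), (0.0, 0.75)]
kept = prune_dominated(lines, 0.0, 1.0)
print(kept)
```

Pruning leaves the upper surface unchanged, because a removed line is nowhere above the line that dominates it. A pointwise check like this can still keep lines that a linear-programming test would discard.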

Findings

01. Efficient computation of optimal policies in structured continuous MDPs.

02. The approach handles complex, structured problems effectively.

03. Exploits natural problem structure for computational gains.

Abstract

We describe an approach for exploiting structure in Markov Decision Processes with continuous state variables. At each step of the dynamic programming, the state space is dynamically partitioned into regions where the value function is the same throughout the region. We first describe the algorithm for piecewise constant representations. We then extend it to piecewise linear representations, using techniques from POMDPs to represent and reason about linear surfaces efficiently. We show that for complex, structured problems, our approach exploits the natural structure so that optimal solutions can be computed efficiently.
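As a rough illustration of the piecewise constant case, the sketch below runs Bellman backups on a one-dimensional state space. It is not the paper's algorithm: the single continuous variable, the deterministic shift transitions, the piecewise constant rewards, the toy problem, and the names `evaluate`, `merge_equal`, and `backup` are all assumptions made for this example. It shows that backing up a piecewise constant value function produces another piecewise constant function, whose regions are derived from the existing breakpoints and can be merged wherever adjacent values agree.

```python
from bisect import bisect_right


def evaluate(breaks, values, x):
    """Piecewise constant lookup: values[i] holds on [breaks[i], breaks[i+1])."""
    i = bisect_right(breaks, x) - 1
    return values[min(max(i, 0), len(values) - 1)]  # clamp at the domain edges


def merge_equal(breaks, values):
    """Collapse adjacent regions that carry the same value."""
    out_b, out_v = [breaks[0]], []
    for right, v in zip(breaks[1:], values):
        if out_v and out_v[-1] == v:
            out_b[-1] = right          # extend the previous region
        else:
            out_v.append(v)
            out_b.append(right)
    return out_b, out_v


def backup(v_breaks, v_values, actions, gamma=0.9):
    """One Bellman backup; each action is (shift, reward_breaks, reward_values)."""
    lo, hi = v_breaks[0], v_breaks[-1]
    cuts = {lo, hi}
    for shift, r_breaks, _ in actions:
        # Q_a can only change value at reward breakpoints or at
        # breakpoints of V pulled back through the shift.
        cuts.update(b for b in r_breaks if lo < b < hi)
        cuts.update(b - shift for b in v_breaks if lo < b - shift < hi)
    breaks = sorted(cuts)
    values = []
    for left, right in zip(breaks, breaks[1:]):
        mid = 0.5 * (left + right)     # any interior point represents the region
        q = [
            evaluate(r_breaks, r_values, mid)
            + gamma * evaluate(v_breaks, v_values, min(max(mid + shift, lo), hi))
            for shift, r_breaks, r_values in actions
        ]
        values.append(max(q))
    return merge_equal(breaks, values)


# Hypothetical toy problem on [0, 4): "move" shifts right by 1 with no reward,
# "stay" earns reward 1 inside the goal region [3, 4).
ACTIONS = [
    (1.0, [0.0, 4.0], [0.0]),
    (0.0, [0.0, 3.0, 4.0], [0.0, 1.0]),
]
breaks, values = [0.0, 4.0], [0.0]
for _ in range(2):
    breaks, values = backup(breaks, values, ACTIONS, gamma=0.5)
print(breaks, values)  # [0.0, 2.0, 3.0, 4.0] [0.0, 0.5, 1.5]
```

Evaluating each refined region at its midpoint is sufficient here because every action's Q-function is constant inside that region by construction. The number of regions grows only where the value actually differs, which is the kind of structure the abstract describes exploiting.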

Taxonomy

Topics: Formal Methods in Verification · Bayesian Modeling and Causal Inference · Reinforcement Learning in Robotics