Computational Approaches for Stochastic Shortest Path on Succinct MDPs

Krishnendu Chatterjee; Hongfei Fu; Amir Kafshdar Goharshady; Nastaran Okati

arXiv:1804.08984 · cs.PL · July 18, 2018

TL;DR

This paper presents computational methods for bounding the stochastic shortest path (SSP) value in succinct MDPs: a polynomial-time method for upper bounds and a polynomial-time reduction to quadratic programming for lower bounds. The approach also applies to infinite-state MDPs and is evaluated experimentally.

Contribution

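The paper shows that several examples from the AI literature can be modeled as succinct MDPs, gives a method for SSP upper bounds that runs in time polynomial in the implicit description of the MDP, reduces the lower-bound problem to quadratic programming in polynomial time, and reports experiments on classical AI examples.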

Findings

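1. A method for computing upper bounds on SSP in succinct MDPs that is polynomial-time in the implicit description of the MDP

2. A polynomial-time reduction to quadratic programming for lower bounds

3. Applicability even to infinite-state MDPs, with experiments on several classical examples from the AI literature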

Abstract

We consider the stochastic shortest path (SSP) problem for succinct Markov decision processes (MDPs), where the MDP consists of a set of variables, and a set of nondeterministic rules that update the variables. First, we show that several examples from the AI literature can be modeled as succinct MDPs. Then we present computational approaches for upper and lower bounds for the SSP problem: (a) for computing upper bounds, our method is polynomial-time in the implicit description of the MDP; (b) for lower bounds, we present a polynomial-time (in the size of the implicit description) reduction to quadratic programming. Our approach is applicable even to infinite-state MDPs. Finally, we present experimental results to demonstrate the effectiveness of our approach on several classical examples from the AI literature.
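
To make the notion of a succinct MDP concrete, the sketch below encodes a toy instance and checks a candidate upper bound on its expected cost. Everything in it is an illustrative assumption rather than material from the paper: the single variable `x`, the two rules `slow` and `risky`, the unit step cost, the candidate bound `2 * x`, and the one-step check itself, which is a standard way to certify an upper bound and may differ from the paper's construction.

```python
from fractions import Fraction

# Hypothetical toy succinct MDP (not taken from the paper):
# one integer variable x, the process runs while x >= 1, each step costs 1.
# At every step the controller picks one rule; each rule adds a random
# increment to x, given as (probability, increment) pairs.
RULES = {
    "slow": [(Fraction(3, 4), -1), (Fraction(1, 4), +1)],
    "risky": [(Fraction(3, 5), -1), (Fraction(2, 5), +1)],
}
STEP_COST = 1


def candidate_bound(x):
    """Assumed linear candidate for an upper bound on the minimal expected cost."""
    return 2 * x


def one_step_value(x, rule):
    """Step cost plus the expected candidate bound after applying `rule` at x."""
    return STEP_COST + sum(p * candidate_bound(x + dx) for p, dx in RULES[rule])


def certifies_upper_bound(xs):
    """True if, at every x in xs, some rule does not exceed the candidate bound."""
    return all(
        min(one_step_value(x, rule) for rule in RULES) <= candidate_bound(x)
        for x in xs
    )


if __name__ == "__main__":
    print(certifies_upper_bound(range(1, 200)))
    print(one_step_value(5, "slow"), one_step_value(5, "risky"))
```

The check runs over a finite range only, while the state space (all positive integers) is infinite. In this toy case the condition is linear in `x`, so it can also be verified symbolically: for `slow`, the one-step value equals `2 * x` for every `x`. The candidate is also zero at the terminal state `x = 0` and nonnegative on all reachable states, which this style of argument relies on.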
