Symblicit algorithms for optimal strategy synthesis in monotonic Markov decision processes
Aaron Bohy (Université de Mons), Véronique Bruyère (Université de Mons), Jean-François Raskin (Université Libre de Bruxelles)

TL;DR
This paper introduces pseudo-antichains, an extension of antichains, as a data structure for symblicit algorithms that synthesize optimal strategies in monotonic MDPs, and reports gains in running time and memory consumption on practical applications.
Contribution
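It proposes pseudo-antichains as an alternative to binary decision diagrams for the symbolic part of symblicit algorithms, and designs pseudo-antichain based algorithms for monotonic MDPs in two quantitative settings: the expected mean-payoff and the stochastic shortest path.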
Findings
Pseudo-antichains make symblicit strategy synthesis more efficient on monotonic MDPs.
The algorithms show promising results on applications from automated planning and LTL synthesis.
Memory consumption is reduced compared to previous methods.
Abstract
When treating Markov decision processes (MDPs) with large state spaces, using explicit representations quickly becomes unfeasible. Lately, Wimmer et al. have proposed a so-called symblicit algorithm for the synthesis of optimal strategies in MDPs, in the quantitative setting of expected mean-payoff. This algorithm, based on the strategy iteration algorithm of Howard and Veinott, efficiently combines symbolic and explicit data structures, and uses binary decision diagrams as symbolic representation. The aim of this paper is to show that the new data structure of pseudo-antichains (an extension of antichains) provides another interesting alternative, especially for the class of monotonic MDPs. We design efficient pseudo-antichain based symblicit algorithms (with open source implementations) for two quantitative settings: the expected mean-payoff and the stochastic shortest path. For two…
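To make the underlying data structure concrete, here is a minimal sketch of a plain antichain, the structure that pseudo-antichains extend. It is not the paper's pseudo-antichain and not taken from its implementation. It assumes states are integer tuples ordered componentwise, and the names `Antichain` and `leq` are hypothetical. The idea it illustrates is that a downward-closed set can be stored compactly by keeping only its maximal elements.

```python
from typing import Iterable, List, Tuple

Vec = Tuple[int, ...]


def leq(a: Vec, b: Vec) -> bool:
    """Componentwise partial order on integer tuples (an assumption of this sketch)."""
    return all(x <= y for x, y in zip(a, b))


class Antichain:
    """Represents a downward-closed set by its pairwise incomparable maximal elements."""

    def __init__(self, elems: Iterable[Vec] = ()) -> None:
        self.maximal: List[Vec] = []
        for e in elems:
            self.add(e)

    def add(self, e: Vec) -> None:
        # If e is already below a stored element, the closure is unchanged.
        if any(leq(e, m) for m in self.maximal):
            return
        # Otherwise drop every stored element that e now dominates.
        self.maximal = [m for m in self.maximal if not leq(m, e)]
        self.maximal.append(e)

    def __contains__(self, e: Vec) -> bool:
        # e is in the closed set iff it lies below some maximal element.
        return any(leq(e, m) for m in self.maximal)

    def union(self, other: "Antichain") -> "Antichain":
        return Antichain(self.maximal + other.maximal)

    def intersection(self, other: "Antichain") -> "Antichain":
        # The closures of a and b intersect in the closure of their
        # componentwise minimum, so it suffices to combine maximal elements.
        return Antichain(
            tuple(min(x, y) for x, y in zip(a, b))
            for a in self.maximal
            for b in other.maximal
        )


A = Antichain([(1, 2), (2, 1), (0, 0), (1, 1)])
B = Antichain([(2, 0), (0, 3)])
print(sorted(A.maximal))                  # only the maximal elements are stored
print((1, 1) in A, (2, 2) in A)
print(sorted(A.intersection(B).maximal))
```

Membership, union, and intersection all work on the maximal elements alone, which is why the representation stays small when the set has few incomparable maxima.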
