Accelerating Policy Synthesis in Large-Scale MDPs via Hierarchical Adaptive Refinement
Alexandros Evangelidis, Gricel Vázquez, Simos Gerasimou

TL;DR
This paper presents a hierarchical adaptive refinement method that accelerates policy synthesis in large-scale MDPs, balancing accuracy against efficiency and achieving up to 2x speedup over PRISM.
Contribution
It introduces a dynamic refinement approach that selectively refines the most fragile regions of a large MDP, yielding near-optimal policies with bounded error at lower computational cost.
Findings
Achieves up to 2x speedup over PRISM in large MDPs.
Provides near-optimal policies with bounded error.
Demonstrates effectiveness on MDPs up to 1 million states.
Abstract
Software-intensive systems, such as software product lines and robotics, utilise Markov decision processes (MDPs) to capture uncertainty and analyse sequential decision-making problems. Despite the usefulness of conventional policy synthesis methods, they fail to scale to large state spaces. Our approach addresses this issue and accelerates policy synthesis in large MDPs by dynamically refining the MDP and iteratively selecting the most fragile MDP regions for refinement. This iterative procedure offers a balance between accuracy and efficiency, as refinement occurs only when necessary. We formally show that the composed policy is near-optimal under standard assumptions, with error bounded by the local solver tolerance and boundary mismatch. Across diverse case studies and MDPs up to 1M states, we demonstrate that our approach achieves up to 2x speedup over PRISM, offering a…
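The refine-only-where-needed loop described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the summary does not give their fragility measure or refinement operator, so the code below assumes a fixed partition of states into regions, uses the Bellman residual as a stand-in "fragility" score, and treats "refinement" as re-solving one region by local value iteration with all other values frozen. The function names (`bellman_backup`, `solve_region`, `adaptive_refinement`) and the discounted-reward setting are hypothetical choices made for illustration.

```python
import numpy as np

def bellman_backup(P, R, V, gamma, states):
    """One Bellman backup for the given states.
    P has shape (A, S, S), R has shape (A, S), V has shape (S,)."""
    Q = R[:, states] + gamma * (P[:, states, :] @ V)  # shape (A, len(states))
    return Q.max(axis=0), Q.argmax(axis=0)

def solve_region(P, R, V, gamma, states, tol):
    """Local value iteration: update only `states` in place,
    keeping the values of all other states (the boundary) frozen."""
    while True:
        new_vals, _ = bellman_backup(P, R, V, gamma, states)
        delta = np.max(np.abs(new_vals - V[states]))
        V[states] = new_vals
        if delta < tol:
            return

def adaptive_refinement(P, R, regions, gamma=0.95, tol=1e-6, max_rounds=10_000):
    """Repeatedly re-solve the region with the largest Bellman residual
    (an assumed proxy for 'fragility') until every residual is below tol."""
    n_states = P.shape[1]
    V = np.zeros(n_states)
    all_states = np.arange(n_states)
    for _ in range(max_rounds):
        backed_up, _ = bellman_backup(P, R, V, gamma, all_states)
        residual = np.abs(backed_up - V)
        scores = [residual[r].max() for r in regions]
        worst = int(np.argmax(scores))
        if scores[worst] < tol:
            break  # no region needs further refinement
        solve_region(P, R, V, gamma, regions[worst], tol)
    # Compose a global policy from the final values.
    _, policy = bellman_backup(P, R, V, gamma, all_states)
    return V, policy
```

In this toy setting, once every residual is below `tol`, the standard value-iteration bound gives a value error of at most `tol / (1 - gamma)` in the sup norm, which loosely mirrors the role the abstract assigns to the local solver tolerance. The boundary-mismatch term in the paper's bound has no direct counterpart here.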
