TL;DR
This paper introduces a bounded value iteration (BVI) algorithm for stochastic games with reachability objectives. It combines the local Bellman updates typical of BVI with global propagation of upper bounds, computed by solving a widest path problem on a weighted graph, so that propagation is not hindered by end components (ECs).
Contribution
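The paper proposes a new BVI algorithm that adds global propagation of upper bounds to the usual local Bellman update. The global step is carried out by solving a widest path problem on a weighted graph and is not hindered by ECs, whose costly computation often impedes previous BVI algorithms.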
Findings
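In the authors' experiments, the new algorithm shows a performance advantage over previous BVI algorithms that rely on EC computation.
Global propagation of upper bounds is not hindered by ECs, whose computation is costly and often impedes BVI.
Solving a widest path problem on a weighted graph keeps the global propagation computationally tractable.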
Abstract
Solving stochastic games with the reachability objective is a fundamental problem, especially in quantitative verification and synthesis. For this purpose, bounded value iteration (BVI) attracts attention as an efficient iterative method. However, BVI's performance is often impeded by costly end component (EC) computation that is needed to ensure convergence. Our contribution is a novel BVI algorithm that conducts, in addition to local propagation by the Bellman update that is typical of BVI, global propagation of upper bounds that is not hindered by ECs. To conduct global propagation in a computationally tractable manner, we construct a weighted graph and solve the widest path problem in it. Our experiments show the algorithm's performance advantage over the previous BVI algorithms that rely on EC computation.
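The abstract names the widest path problem without defining it. In its standard form, the width of a path is its smallest edge weight, and the task is to find, for each node, the largest width over all paths from a source. The sketch below solves that generic problem with a Dijkstra-style search. It is an illustration only: the toy graph, its weights, and the function name `widest_path_widths` are assumptions, and the abstract does not say how the paper constructs its weighted graph or turns widths into upper bounds.

```python
import heapq


def widest_path_widths(graph, source):
    """Return the maximum bottleneck width from `source` to each reachable node.

    `graph` maps a node to a list of (neighbor, weight) pairs.
    Weights are assumed to be positive (an assumption of this sketch).
    """
    # The source reaches itself with unlimited width.
    width = {source: float("inf")}
    # Max-heap simulated by pushing negated widths.
    heap = [(-float("inf"), source)]
    while heap:
        neg_w, u = heapq.heappop(heap)
        w = -neg_w
        if w < width.get(u, 0.0):
            continue  # stale heap entry; a wider path to u was already found
        for v, edge_w in graph.get(u, []):
            # The width of a path is its narrowest edge.
            candidate = min(w, edge_w)
            if candidate > width.get(v, 0.0):
                width[v] = candidate
                heapq.heappush(heap, (-candidate, v))
    return width


# Hypothetical toy graph, not taken from the paper.
toy_graph = {
    "s": [("a", 0.9), ("b", 0.4)],
    "a": [("t", 0.5)],
    "b": [("t", 0.8)],
    "t": [],
}
print(widest_path_widths(toy_graph, "s")["t"])
```

In the toy graph, the path s, a, t has width min(0.9, 0.5) = 0.5 and the path s, b, t has width min(0.4, 0.8) = 0.4, so the widest path to t has width 0.5. The search settles each node once in order of decreasing width, which is why a single pass over the graph suffices.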
