Policy Synthesis for Interval MDPs via Polyhedral Lyapunov Functions
Negar Monir, Sadegh Soudjani

TL;DR
This paper presents a new method for synthesizing policies in interval MDPs using polyhedral Lyapunov functions, improving accuracy and efficiency in handling uncertainties for safety-critical decision-making.
Contribution
It introduces a polyhedral Lyapunov function-based approach for policy synthesis in interval MDPs, avoiding complex Pareto computations and providing convergence guarantees.
Findings
Effective policy synthesis is demonstrated on a recycling robot case study.
The method reduces computational complexity compared to previous approaches.
Numerical results show improved handling of uncertainties in decision-making.
Abstract
Decision-making under uncertainty is central to many safety-critical applications, where decisions must be guided by probabilistic modeling formalisms. This paper introduces a novel approach to policy synthesis in multi-objective interval Markov decision processes using polyhedral Lyapunov functions. Unlike previous Lyapunov-based methods that mainly rely on quadratic functions, our method utilizes polyhedral functions to enhance accuracy in managing uncertainties within value iteration of dynamic programming. We reformulate the value iteration algorithm as a switched affine system with interval uncertainties and apply control-theoretic stability principles to synthesize policies that guide the system toward a desired target set. By constructing an invariant set of attraction, we ensure that the synthesized policies provide convergence guarantees while minimizing the impact of…
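To make the reformulation mentioned in the abstract more concrete, here is a minimal sketch of one pessimistic value-iteration step on an interval MDP. It is not the authors' algorithm. It assumes a single discounted reward objective rather than the multi-objective setting, and it resolves the interval uncertainty with the standard ordering-based worst-case distribution. The names `worst_case_distribution` and `robust_backup`, and all numbers in the toy model, are hypothetical. Once an action and a worst-case distribution are fixed, the update for a state has the affine form r + gamma * p . V, which is why the iteration can be read as switching between affine maps.

```python
def worst_case_distribution(lower, upper, values):
    """Pick p with lower <= p <= upper and sum(p) == 1 that minimises sum(p * values).

    Starts from the lower bounds and pushes the remaining probability mass
    onto the successors with the smallest values first.
    """
    p = list(lower)
    budget = 1.0 - sum(lower)
    for j in sorted(range(len(values)), key=lambda k: values[k]):
        extra = min(upper[j] - lower[j], budget)
        p[j] += extra
        budget -= extra
    return p


def robust_backup(V, rewards, lower, upper, gamma):
    """One pessimistic Bellman step; returns (new values, greedy policy).

    rewards[s][a] is a scalar reward; lower[s][a] and upper[s][a] are lists of
    per-successor probability bounds (assumed consistent: sum(lower) <= 1 <= sum(upper)).
    """
    new_V, policy = [], []
    for s in range(len(V)):
        best_q, best_a = None, None
        for a in range(len(rewards[s])):
            p = worst_case_distribution(lower[s][a], upper[s][a], V)
            # Affine in V for the chosen action and distribution.
            q = rewards[s][a] + gamma * sum(pj * vj for pj, vj in zip(p, V))
            if best_q is None or q > best_q:
                best_q, best_a = q, a
        new_V.append(best_q)
        policy.append(best_a)
    return new_V, policy


# Hypothetical two-state, two-action model (illustrative numbers only).
rewards = [[1.0, 0.0], [0.0, 2.0]]
lower = [[[0.6, 0.1], [0.2, 0.3]], [[0.1, 0.5], [0.4, 0.4]]]
upper = [[[0.9, 0.4], [0.7, 0.8]], [[0.5, 0.9], [0.6, 0.6]]]

V = [0.0, 0.0]
for _ in range(200):
    V, policy = robust_backup(V, rewards, lower, upper, gamma=0.9)
```

The greedy inner step is cheap because the interval constraints form a box intersected with the probability simplex, so sorting successors by value is enough to find the minimiser. The paper's polyhedral Lyapunov analysis and invariant-set construction are not reproduced in this sketch.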
Taxonomy
Topics: Reinforcement Learning in Robotics · Formal Methods in Verification · Numerical Methods and Algorithms
