Lexicographic Multi-Objective Stochastic Shortest Path with Mixed Max-Sum Costs
Zhiquan Zhang, Omar Muhammetkulyyev, Tichakorn Wongpiromsarn, Melkior Ornik

TL;DR
This paper introduces a novel approach for stochastic shortest path problems with max-aggregation costs, addressing safety-critical constraints by combining max and sum objectives through a lexicographic value iteration method.
Contribution
The paper develops a framework for mixed max-sum cost objectives in SSPs, comprising an augmented MDP, a finite-horizon solution for cyclic policies, and a lexicographic algorithm for LTL specifications.
Findings
Effective handling of bottleneck and cumulative costs in SSPs.
Successful application to gridworld case studies.
Resolution of zero-marginal-cost cycle issues.
Abstract
We study the Stochastic Shortest Path (SSP) problem for autonomous systems with mixed max-sum cost aggregations under Linear Temporal Logic constraints. Classical SSP formulations rely on sum-aggregated costs, which are suitable for cumulative quantities such as time or energy but fail to capture bottleneck-style objectives such as avoiding high-risk transitions, where performance is determined by the worst single event along a trajectory. Such objectives are particularly important in safety-critical systems, where even one hazardous transition can be unacceptable. To address this limitation, we introduce max-aggregated objectives that minimize the bottleneck cost, i.e., the maximum one-step cost along a trajectory. We show that standard Bellman equations on the original state space do not apply in this setting and propose an augmented MDP with a state variable tracking the running…
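The augmented-state idea in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical three-state toy MDP (the states, actions, probabilities, and costs below are invented for illustration and do not come from the paper), and it covers only the max-aggregated objective, not the lexicographic or LTL parts. Each augmented state is a pair (s, m), where m is the largest one-step cost seen so far; reaching the goal with running maximum m has value m. The function name max_cost_value_iteration is likewise an assumption, not the authors' implementation.

```python
# Hypothetical toy MDP: TRANSITIONS[state][action] = [(prob, next_state, cost), ...]
TRANSITIONS = {
    "start": {
        "direct": [(1.0, "goal", 4.0)],   # one expensive step
        "detour": [(1.0, "mid", 3.0)],    # cheaper steps, but more of them
    },
    "mid": {
        "go": [(0.5, "goal", 3.0), (0.5, "mid", 3.0)],
    },
}
GOAL = "goal"


def max_cost_value_iteration(transitions, goal, sweeps=200):
    """Value iteration over augmented states (s, m), m = running max cost."""
    # The running maximum can only take the value 0 or one of the step costs.
    costs = {c for acts in transitions.values()
             for outs in acts.values() for _, _, c in outs}
    levels = sorted(costs | {0.0})

    value = {(s, m): 0.0 for s in transitions for m in levels}
    policy = {}

    def lookup(s, m):
        # At the goal, the accumulated bottleneck cost is simply m.
        return m if s == goal else value[(s, m)]

    for _ in range(sweeps):
        for s, acts in transitions.items():
            for m in levels:
                best_action, best_q = None, float("inf")
                for a, outs in acts.items():
                    # The successor's running max is max(m, step cost).
                    q = sum(p * lookup(s2, max(m, c)) for p, s2, c in outs)
                    if q < best_q:
                        best_action, best_q = a, q
                value[(s, m)] = best_q
                policy[(s, m)] = best_action
    return value, policy


value, policy = max_cost_value_iteration(TRANSITIONS, GOAL)
print(policy[("start", 0.0)], value[("start", 0.0)])
```

In this toy instance the max-aggregated objective prefers the detour (expected bottleneck cost approaching 3, against 4 for the direct step), whereas a sum-aggregated objective would prefer the direct step (expected total 4, against 9 for the detour). The same policy cannot be recovered from values indexed by the original state alone, because the cost-to-go from "mid" depends on the running maximum carried into it.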
Taxonomy
Topics: Formal Methods in Verification · Petri Nets in System Modeling · Reinforcement Learning in Robotics
