Optimizing the Expected Mean Payoff in Energy Markov Decision Processes

Tomáš Brázdil; Antonín Kučera; Petr Novotný

arXiv:1607.00678·cs.LO·July 5, 2016

Open Access

TL;DR

This paper studies how to compute safe strategies in Energy Markov Decision Processes (EMDPs), i.e., strategies that keep the counter non-negative, while maximizing the expected mean payoff.

Contribution

It introduces methods for computing safe strategies in EMDPs that optimize expected mean payoff while maintaining non-negative energy levels.

Findings

01. Developed algorithms for safe strategy computation in EMDPs.

02. Proved the existence of optimal strategies under safety constraints.

03. Provided complexity analysis of the optimization problem.

Abstract

Energy Markov Decision Processes (EMDPs) are finite-state Markov decision processes where each transition is assigned an integer counter update and a rational payoff. An EMDP configuration is a pair s(n), where s is a control state and n is the current counter value. The configurations are changed by performing transitions in the standard way. We consider the problem of computing a safe strategy (i.e., a strategy that keeps the counter non-negative) which maximizes the expected mean payoff.
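
The definitions in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' algorithm: the toy EMDP, its state and action names, and the two strategies are all hypothetical, and payoffs are stored as floats rather than exact rationals. It shows how a configuration s(n) evolves, what it means for a strategy to be safe, and how a finite run gives an empirical estimate of the mean payoff.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    target: str    # successor control state
    prob: float    # probability of this outcome
    update: int    # integer counter update
    payoff: float  # payoff (a float stands in for a rational here)


# Hypothetical toy EMDP: state -> action -> probabilistic outcomes.
EMDP = {
    "work": {
        "run": [Transition("work", 0.5, -1, 4.0),
                Transition("rest", 0.5, -1, 4.0)],
        "stop": [Transition("rest", 1.0, 0, 0.0)],
    },
    "rest": {
        "charge": [Transition("rest", 0.5, 2, 0.0),
                   Transition("work", 0.5, 2, 0.0)],
    },
}


def step(state, counter, action, rng):
    """Move from configuration state(counter) to a successor configuration."""
    outcomes = EMDP[state][action]
    t = rng.choices(outcomes, weights=[o.prob for o in outcomes])[0]
    return t.target, counter + t.update, t.payoff


def simulate(strategy, state, counter, steps, seed=0):
    """Return the average payoff over a finite run; raise if safety fails."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(steps):
        action = strategy(state, counter)
        state, counter, payoff = step(state, counter, action, rng)
        if counter < 0:
            raise ValueError(f"safety violated in {state}({counter})")
        total += payoff
    return total / steps


def cautious(state, counter):
    """Safe in this toy model: only spend energy when the counter allows it."""
    if state == "work":
        return "run" if counter >= 1 else "stop"
    return "charge"


def reckless(state, counter):
    """Unsafe: ignores the counter, so it can drive it below zero."""
    return "run" if state == "work" else "charge"
```

Calling `simulate(cautious, "work", 0, 1000)` returns an empirical average payoff for one sampled run, while `simulate(reckless, "work", 0, 10)` raises `ValueError` on the first step because the counter drops to -1. A finite simulation only estimates the mean payoff of one run and says nothing about optimality; it is meant to make the definitions concrete.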

Taxonomy

Topics: Formal Methods in Verification · Petri Nets in System Modeling · Advanced Battery Technologies Research