Omega-Regular Decision Processes
Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, Dominik Wojtczak

TL;DR
This paper introduces omega-regular decision processes (ODPs), extending regular decision processes with omega-regular lookaheads, enabling more expressive non-Markovian decision-making with effective optimization and learning methods.
Contribution
The paper proposes omega-regular decision processes (ODPs), extending RDPs with omega-regular lookaheads and providing a reduction to finite MDPs for optimization and learning.
Findings
Effective reduction of ODPs to finite MDPs for optimization.
Experimental validation demonstrating the approach's effectiveness.
Abstract
Regular decision processes (RDPs) are a subclass of non-Markovian decision processes where the transition and reward functions are guarded by some regular property of the past (a lookback). While RDPs enable intuitive and succinct representation of non-Markovian decision processes, their expressive power coincides with finite-state Markov decision processes (MDPs). We introduce omega-regular decision processes (ODPs) where the non-Markovian aspect of the transition and reward functions is extended to an omega-regular lookahead over the system evolution. Semantically, these lookaheads can be considered as promises made by the decision maker or the learning agent about her future behavior. In particular, we assume that, if the promised lookaheads are not met, then the payoff to the decision maker is ⊥ (least desirable payoff), overriding any rewards collected by the decision maker.…
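The abstract's remark that RDPs are no more expressive than finite-state MDPs can be illustrated with the standard product construction: a finite automaton tracks the regular property of the past, and pairing its state with the environment state makes rewards Markovian again. The Python sketch below is a minimal illustration under stated assumptions. The toy environment, the "key" lookback, and all function names are hypothetical and not taken from the paper, and the sketch covers only lookbacks, not the paper's reduction for omega-regular lookaheads.

```python
from typing import List, Tuple

# Hypothetical lookback automaton: remembers whether "key" ever occurred in the past.
DFA_START = "no_key"

def dfa_step(q: str, symbol: str) -> str:
    """Advance the lookback DFA on one observed symbol."""
    return "has_key" if q == "has_key" or symbol == "key" else "no_key"

def toy_env_step(state: str, action: str) -> Tuple[str, str]:
    """Hypothetical deterministic environment: returns (next_state, emitted_symbol)."""
    if state == "hall" and action == "grab":
        return "hall", "key"
    if state == "hall" and action == "go":
        return "door", "move"
    if state == "door" and action == "open":
        return "door", "open"
    return state, "noop"

def guarded_reward(q: str, state: str, action: str) -> float:
    """Reward guarded by a regular property of the past (the DFA state)."""
    return 1.0 if (state == "door" and action == "open" and q == "has_key") else 0.0

def product_step(prod: Tuple[str, str], action: str) -> Tuple[Tuple[str, str], float]:
    """One step of the product MDP over (environment state, DFA state)."""
    state, q = prod
    reward = guarded_reward(q, state, action)  # depends only on the product state
    next_state, symbol = toy_env_step(state, action)
    return (next_state, dfa_step(q, symbol)), reward

def run(actions: List[str]) -> Tuple[float, Tuple[str, str]]:
    """Roll out a fixed action sequence and return (total reward, final product state)."""
    prod, total = ("hall", DFA_START), 0.0
    for a in actions:
        prod, r = product_step(prod, a)
        total += r
    return total, prod
```

In this sketch, opening the door pays off only if the key was grabbed earlier, yet the reward is a function of the current product state alone, so ordinary MDP methods apply to the product.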
Taxonomy
Topics: Reinforcement Learning in Robotics · Machine Learning and Algorithms
