Learning Explainable and Better Performing Representations of POMDP Strategies

Alexander Bork; Debraj Chakraborty; Kush Grover; Jan Kretinsky; Stefanie Mohr

arXiv:2401.07656·cs.AI·October 3, 2024·2 cites

TL;DR

This paper introduces a scalable method to learn compact, explainable automaton representations of POMDP strategies using a modified L*-algorithm, which can also enhance strategy performance.

Contribution

It presents a scalable approach that learns automaton representations of POMDP strategies. The automata are much smaller and more explainable than the tabular representation of the same strategy, and the learning heuristics may improve the strategy's performance.

Findings

1. Automaton representations are significantly smaller than tabular strategies.
2. The learned automata can improve strategy performance.
3. The method is more scalable than direct automaton synthesis from POMDPs.

Abstract

Strategies for partially observable Markov decision processes (POMDPs) typically require memory. One way to represent this memory is via automata. We present a method to learn an automaton representation of a strategy using a modification of the L*-algorithm. Compared to the tabular representation of a strategy, the resulting automaton is dramatically smaller and thus also more explainable. Moreover, in the learning process, our heuristics may even improve the strategy's performance. In contrast to approaches that synthesize an automaton directly from the POMDP, thereby solving it, our approach is incomparably more scalable.
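To make the size gap between a tabular strategy and an automaton concrete, here is a minimal sketch. It is not the paper's learning algorithm and involves no L* queries. The observations, actions, and the parity rule are hypothetical and chosen only for illustration, and the automaton is written by hand as a Mealy machine, one common form of finite-state controller.

```python
from itertools import product

# Hypothetical toy setting: two observations, two actions (not from the paper).
OBSERVATIONS = ("wall", "free")


def tabular_strategy(max_len):
    """Map every observation history up to max_len to an action.

    Assumed rule: play 'left' after an odd number of 'wall' observations,
    otherwise play 'right'.
    """
    table = {}
    for n in range(1, max_len + 1):
        for history in product(OBSERVATIONS, repeat=n):
            walls = history.count("wall")
            table[history] = "left" if walls % 2 == 1 else "right"
    return table


class MealyMachine:
    """Finite-state controller: each (state, observation) pair yields
    a successor state and an output action."""

    def __init__(self, initial, transitions):
        self.initial = initial
        self.transitions = transitions  # (state, obs) -> (next_state, action)

    def action_for(self, history):
        # Replay the history from the initial state; the last output
        # is the action the strategy plays after this history.
        state, action = self.initial, None
        for obs in history:
            state, action = self.transitions[(state, obs)]
        return action


# Two memory states are enough to track the parity of 'wall' observations.
automaton = MealyMachine(
    initial="even",
    transitions={
        ("even", "wall"): ("odd", "left"),
        ("even", "free"): ("even", "right"),
        ("odd", "wall"): ("even", "right"),
        ("odd", "free"): ("odd", "left"),
    },
)

table = tabular_strategy(max_len=6)
agrees = all(automaton.action_for(h) == a for h, a in table.items())
print(len(table), len(automaton.transitions), agrees)
```

In this toy case the table grows exponentially with the history length, while the automaton stays at two states and four transitions and prescribes the same action for every history in the table. A learning procedure such as the one described in the abstract would aim to recover a machine of this kind from the strategy rather than have it written by hand.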

Taxonomy

Topics: Machine Learning and Algorithms · Bayesian Modeling and Causal Inference · Formal Methods in Verification