Learning Explainable and Better Performing Representations of POMDP Strategies
Alexander Bork, Debraj Chakraborty, Kush Grover, Jan Kretinsky, Stefanie Mohr

TL;DR
This paper introduces a scalable method to learn compact, explainable automaton representations of POMDP strategies using a modified L*-algorithm, which can also enhance strategy performance.
Contribution
It presents a novel, scalable approach to learn automaton-based strategies for POMDPs that are smaller, more explainable, and potentially more effective than existing methods.
Findings
Automaton representations are significantly smaller than tabular strategies.
The learned automata can improve strategy performance.
The method is more scalable than direct automaton synthesis from POMDPs.
Abstract
Strategies for partially observable Markov decision processes (POMDPs) typically require memory. One way to represent this memory is via automata. We present a method to learn an automaton representation of a strategy using a modification of the L*-algorithm. Compared to the tabular representation of a strategy, the resulting automaton is dramatically smaller and thus also more explainable. Moreover, in the learning process, our heuristics may even improve the strategy's performance. In contrast to approaches that synthesize an automaton directly from the POMDP, thereby solving it, our approach is incomparably more scalable.
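To make the contrast between a tabular and an automaton representation concrete, here is a minimal sketch in Python. It is not the paper's learning algorithm (no L*-style queries are shown); it only illustrates how a small finite-state controller can encode the same decisions as a table indexed by observation histories. The observation names, action names, and the `MealyStrategy` class are hypothetical and chosen purely for illustration.

```python
from itertools import product

# Hypothetical toy alphabet; these names do not come from the paper.
OBSERVATIONS = ("wall", "free")


class MealyStrategy:
    """Finite-state controller: (memory state, observation) -> (action, next state)."""

    def __init__(self, initial, transitions):
        self.initial = initial
        self.transitions = transitions

    def run(self, observations):
        """Return the action chosen after each observation in the sequence."""
        state, actions = self.initial, []
        for obs in observations:
            action, state = self.transitions[(state, obs)]
            actions.append(action)
        return actions


# Two memory states: q1 remembers that a wall was seen in the previous step,
# so the same observation "free" can lead to different actions.
strategy = MealyStrategy(
    initial="q0",
    transitions={
        ("q0", "free"): ("forward", "q0"),
        ("q0", "wall"): ("turn", "q1"),
        ("q1", "free"): ("turn", "q0"),
        ("q1", "wall"): ("turn", "q1"),
    },
)


def tabulate(controller, horizon):
    """Expand the controller into a table: observation history -> action."""
    return {
        history: controller.run(history)[-1]
        for length in range(1, horizon + 1)
        for history in product(OBSERVATIONS, repeat=length)
    }


# The table grows exponentially with the horizon; the automaton stays fixed.
table = tabulate(strategy, horizon=4)
```

In this toy setting the automaton has four transitions, while the equivalent table over histories of length at most four already has 30 entries and keeps growing with the horizon. That size gap is the intuition behind the claim that automaton representations are smaller and easier to inspect.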
Taxonomy
Topics: Machine Learning and Algorithms · Bayesian Modeling and Causal Inference · Formal Methods in Verification
