Hierarchical POMDP Controller Optimization by Likelihood Maximization
Marc Toussaint, Laurent Charlin, Pascal Poupart

TL;DR
This paper introduces a scalable method for hierarchical planning in partially observable domains by framing hierarchy discovery as a maximum likelihood estimation problem, improving over previous non-convex optimization approaches.
Contribution
It presents a novel maximum likelihood-based approach for hierarchy discovery in POMDPs, enabling more scalable and efficient planning in complex environments.
Findings
The method scales better than previous non-convex optimization techniques.
Experimental results show improved efficiency in hierarchy discovery.
The approach transforms the problem into a dynamic Bayesian network, so the hierarchical structure is discovered while the policy is being optimized.
Abstract
Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. [4] recently showed that the hierarchy discovery problem can be framed as a non-convex optimization problem. However, the inherent computational difficulty of solving such an optimization problem makes it hard to scale to real-world problems. In another line of research, Toussaint et al. [18] developed a method to solve planning problems by maximum-likelihood estimation. In this paper, we show how the hierarchy discovery problem in partially observable domains can be tackled using a similar maximum likelihood approach. Our technique first transforms the problem into a dynamic Bayesian network through which a hierarchical structure can naturally be discovered while optimizing the policy. Experimental results demonstrate that this approach scales better than previous techniques…
Taxonomy
Topics: Machine Learning and Algorithms · Bayesian Modeling and Causal Inference · Fault Detection and Control Systems
