TL;DR
This paper introduces an online Baum-Welch algorithm for hierarchical imitation learning within the options framework, enabling scalable, real-time learning from expert demonstrations in both discrete and continuous environments.
Contribution
It presents a novel online EM-type algorithm for hierarchical imitation learning in the options framework and compares it with its batch version, which it outperforms under certain conditions.
Findings
The online algorithm performs well in discrete environments.
It outperforms batch algorithms under certain conditions.
It is also effective in continuous environments.
Abstract
The options framework for hierarchical reinforcement learning has grown in popularity in recent years and has made progress on the scalability problem in reinforcement learning. Yet most of these recent successes depend on proper initialization or discovery of the options. When an expert is available, the options discovery problem can be addressed by learning an options-type hierarchical policy directly from expert demonstrations. This problem is referred to as hierarchical imitation learning and can be handled as an inference problem in a Hidden Markov Model, which is solved with an Expectation-Maximization-type algorithm. In this work, we propose a novel online algorithm to perform hierarchical imitation learning in the options framework. Further, we discuss the benefits of such an algorithm and compare it with its batch version in classical reinforcement learning…
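To make the Hidden Markov Model view concrete, the following is a minimal sketch of the forward (filtering) recursion over the latent option, which is the kind of quantity a Baum-Welch-type E-step builds on. It is not the paper's algorithm. It assumes a tabular options model, and the names `filter_options`, `pi_hi`, `pi_lo`, and `pi_b` are hypothetical, chosen here only for illustration.

```python
def filter_options(states, actions, pi_hi, pi_lo, pi_b):
    """Filtered posterior over the active option after each (state, action) pair.

    Assumed (hypothetical) tabular parameterization:
      pi_hi[s][o]    probability of starting option o in state s
      pi_lo[o][s][a] probability of action a in state s under option o
      pi_b[o][s]     probability that option o terminates in state s
    """
    n_options = len(pi_b)
    beliefs = []
    belief = None
    for s, a in zip(states, actions):
        if belief is None:
            # Assumption: the first option is drawn from the high-level policy.
            predicted = list(pi_hi[s])
        else:
            # Probability mass that terminates in s and is redistributed by pi_hi.
            terminate = sum(belief[o] * pi_b[o][s] for o in range(n_options))
            predicted = [
                terminate * pi_hi[s][o] + belief[o] * (1.0 - pi_b[o][s])
                for o in range(n_options)
            ]
        # Weight by how well each option explains the observed action.
        unnorm = [predicted[o] * pi_lo[o][s][a] for o in range(n_options)]
        # Assumes at least one option gives the observed action nonzero probability.
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        beliefs.append(belief)
    return beliefs


# Toy model: 2 options, 2 states, 2 actions.
pi_hi = [[0.5, 0.5], [0.5, 0.5]]
pi_lo = [
    [[0.9, 0.1], [0.9, 0.1]],  # option 0 prefers action 0
    [[0.1, 0.9], [0.1, 0.9]],  # option 1 prefers action 1
]
pi_b = [[0.1, 0.1], [0.1, 0.1]]

beliefs = filter_options([0, 1, 0], [0, 0, 0], pi_hi, pi_lo, pi_b)
```

Because each belief is updated from the previous one using only the newest state-action pair, the recursion runs in a single pass over the demonstration, which is the property an online variant can exploit. The environment's state transitions do not appear because, in this assumed model, they do not depend on the option once the action is given.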
