Hidden Parameter Markov Decision Processes: A Semiparametric Regression Approach for Discovering Latent Task Parametrizations
Finale Doshi-Velez, George Konidaris

TL;DR
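This paper introduces the Hidden Parameter Markov Decision Process (HiP-MDP), which models a family of related dynamical systems with a small set of latent parameters, learns that structure from data with semiparametric regression, and uses the learned model to identify the dynamics of new task instances quickly in control applications.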
Contribution
It presents a framework that combines a low-dimensional latent parametrization of related dynamical systems with semiparametric regression, so that an agent can adapt efficiently to new task instances in control settings.
Findings
Rapid identification of new task dynamics
Effective modeling of related dynamical systems
Flexible adaptation to task variations
Abstract
Control applications often feature tasks with similar, but not identical, dynamics. We introduce the Hidden Parameter Markov Decision Process (HiP-MDP), a framework that parametrizes a family of related dynamical systems with a low-dimensional set of latent factors, and introduce a semiparametric regression approach for learning its structure from data. In the control setting, we show that a learned HiP-MDP rapidly identifies the dynamics of a new task instance, allowing an agent to flexibly adapt to task variations.
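To make the idea of identifying a new task instance from a low-dimensional set of latent factors concrete, here is a minimal sketch in Python. It is not the authors' method: the paper learns the structure from data with semiparametric regression, whereas this sketch assumes a hand-picked, fixed set of basis functions and a Gaussian prior over the latent weights. The names `basis`, `sample_transitions`, and `infer_latent`, the scalar state, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def basis(s):
    """Hypothetical fixed basis functions of a scalar state (an assumption,
    not taken from the paper, which learns its structure from data)."""
    s = np.asarray(s, dtype=float)
    return np.stack([s, np.sin(s), np.ones_like(s)], axis=-1)

def sample_transitions(w, n, noise_std, rng):
    """Draw n one-step transitions s -> s' for a task with latent weights w,
    assuming s' = s + basis(s) @ w + Gaussian noise."""
    s = rng.uniform(-3.0, 3.0, size=n)
    s_next = s + basis(s) @ w + noise_std * rng.normal(size=n)
    return s, s_next

def infer_latent(s, s_next, prior_mean, prior_var, noise_std):
    """Posterior mean of the latent weights under Bayesian linear regression
    with an isotropic Gaussian prior (illustrative stand-in for inference)."""
    Phi = basis(s)                      # (n, K) design matrix
    y = s_next - s                      # observed state changes
    K = Phi.shape[1]
    A = Phi.T @ Phi / noise_std**2 + np.eye(K) / prior_var
    b = Phi.T @ y / noise_std**2 + prior_mean / prior_var
    return np.linalg.solve(A, b)

rng = np.random.default_rng(0)
w_true = np.array([-0.3, 0.5, 0.1])     # latent parameters of the new task
s, s_next = sample_transitions(w_true, n=20, noise_std=0.01, rng=rng)
w_hat = infer_latent(s, s_next, prior_mean=np.zeros(3),
                     prior_var=10.0, noise_std=0.01)
print("estimated latent weights:", w_hat)
```

Because the basis functions are shared across tasks in this sketch, only the three weights have to be estimated for a new instance, which is why a small number of observed transitions can suffice.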
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Reinforcement Learning in Robotics · Data Stream Mining Techniques
