A Bayesian Approach to Policy Recognition and State Representation Learning

Adrian Šošić; Abdelhak M. Zoubir; Heinz Koeppl

arXiv:1605.01278 · stat.ML · August 7, 2017

TL;DR

This paper introduces a Bayesian framework for learning from demonstration that models the full distribution of expert policies and infers the complexity of state representations without assuming optimality or deterministic behavior.

Contribution

It presents a Bayesian approach to policy recognition that handles stochastic expert behaviors and infers task-relevant state space partitionings in a nonparametric manner.

Findings

01 Successfully models the posterior distribution of expert policies.

02 Infers the complexity of state representations from demonstration data.

03 Learns task-specific state space partitionings.

Abstract

Learning from demonstration (LfD) is the process of building behavioral models of a task from demonstrations provided by an expert. These models can be used e.g. for system control by generalizing the expert demonstrations to previously unencountered situations. Most LfD methods, however, make strong assumptions about the expert behavior, e.g. they assume the existence of a deterministic optimal ground truth policy or require direct monitoring of the expert's controls, which limits their practical use as part of a general system identification framework. In this work, we consider the LfD problem in a more general setting where we allow for arbitrary stochastic expert policies, without reasoning about the optimality of the demonstrations. Following a Bayesian methodology, we model the full posterior distribution of possible expert controllers that explain the provided demonstration data.…
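
To make the idea of a posterior distribution over stochastic expert policies concrete, here is a minimal Python sketch. It is not the authors' model. It assumes a small finite state and action space, an independent symmetric Dirichlet prior on each state's action distribution, and directly observed (state, action) pairs, which is exactly the kind of monitoring of the expert's controls that the paper says it does not require. The function names and the `alpha` parameter are hypothetical and chosen for illustration only.

```python
import random


def policy_posterior(demos, n_states, n_actions, alpha=1.0):
    """Dirichlet posterior parameters for a tabular stochastic policy.

    Assumption: each state's action distribution has an independent
    symmetric Dirichlet(alpha) prior, and demos is a list of observed
    (state, action) index pairs. Conjugacy gives: prior + counts.
    """
    params = [[alpha] * n_actions for _ in range(n_states)]
    for state, action in demos:
        params[state][action] += 1.0
    return params


def posterior_mean(params):
    """Expected action probabilities per state under the posterior."""
    return [[p / sum(row) for p in row] for row in params]


def sample_policy(params, rng):
    """Draw one policy from the posterior.

    A Dirichlet sample is obtained by normalizing independent
    Gamma(p, 1) draws, one per action.
    """
    policy = []
    for row in params:
        gammas = [rng.gammavariate(p, 1.0) for p in row]
        total = sum(gammas)
        policy.append([g / total for g in gammas])
    return policy


# Hypothetical demonstration data: three visits to state 0, one to state 1.
demos = [(0, 1), (0, 1), (0, 0), (1, 2)]
params = policy_posterior(demos, n_states=3, n_actions=3)
mean = posterior_mean(params)
sampled = sample_policy(params, random.Random(0))
```

State 2 never appears in the demonstrations, so its posterior equals the prior and its mean action distribution stays uniform. Sampling many policies with `sample_policy` gives a picture of how much uncertainty remains in each state, rather than a single point estimate of the expert's behavior.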
