Inverse Reinforce Learning with Nonparametric Behavior Clustering

Siddharthan Rajasekaran, Jinwei Zhang, and Jie Fu

arXiv:1712.05514 · cs.AI · December 18, 2017 · 2 cites

Open Access

TL;DR

This paper presents a non-parametric clustering approach to inverse reinforcement learning that simultaneously identifies multiple behavior patterns and their reward functions from diverse demonstrations, improving efficiency and handling behavioral heterogeneity.

Contribution

It introduces an iterative EM-based non-parametric clustering algorithm for IRL that efficiently learns multiple reward functions without solving separate IRL problems for each cluster.

Findings

1. Successfully clustered driver behaviors in grid-world and robot car simulations.
2. Demonstrated convergence and improved computational efficiency of the method.
3. Effectively inferred multiple reward functions from heterogeneous demonstrations.

Abstract

Inverse Reinforcement Learning (IRL) is the task of learning a single reward function from a Markov Decision Process (MDP) whose reward function is unspecified and a set of demonstrations generated by humans/experts. However, in practice, it may be unreasonable to assume that human behaviors can be explained by one reward function, since they may be inherently inconsistent. Also, demonstrations may be collected from various users and aggregated to infer and predict users' behaviors. In this paper, we introduce the Non-parametric Behavior Clustering IRL algorithm to simultaneously cluster demonstrations and learn multiple reward functions from demonstrations that may be generated by more than one behavior. Our method is iterative: it alternates between clustering demonstrations into different behavior clusters and inverse learning the reward functions until convergence. It is built…
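The alternation between clustering demonstrations and learning reward functions can be illustrated with a small sketch. This is not the authors' implementation: it assumes a fixed number of clusters (the non-parametric part is left out), a linear reward `theta · features`, and a toy setting where every candidate trajectory can be enumerated so the maximum-entropy trajectory distribution is an exact softmax. The names `traj_log_probs`, `em_cluster_irl`, `feats`, and `demo_idx` are hypothetical.

```python
import numpy as np

def traj_log_probs(theta, feats):
    """Log-probabilities of a MaxEnt distribution over a finite trajectory set.

    Assumption: feats[j] holds the feature counts of candidate trajectory j,
    and P(j | theta) is proportional to exp(theta . feats[j]).
    """
    scores = feats @ theta
    m = scores.max()
    return scores - (m + np.log(np.exp(scores - m).sum()))

def em_cluster_irl(demo_idx, feats, n_clusters, n_iters=100, lr=0.5, seed=0):
    """EM-style alternation with a fixed cluster count (a simplification)."""
    rng = np.random.default_rng(seed)
    thetas = rng.normal(scale=0.1, size=(n_clusters, feats.shape[1]))
    priors = np.full(n_clusters, 1.0 / n_clusters)
    for _ in range(n_iters):
        # E-step: soft-assign each demonstration to a behavior cluster.
        logp = np.stack(
            [traj_log_probs(t, feats)[demo_idx] for t in thetas], axis=1
        )
        logr = np.log(priors) + logp
        logr -= logr.max(axis=1, keepdims=True)
        resp = np.exp(logr)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing weights, then take one gradient step per
        # cluster on the responsibility-weighted demonstration log-likelihood.
        priors = np.clip(resp.mean(axis=0), 1e-12, None)
        priors /= priors.sum()
        for k in range(n_clusters):
            expected = np.exp(traj_log_probs(thetas[k], feats)) @ feats
            w = resp[:, k]
            grad = w @ (feats[demo_idx] - expected) / max(w.sum(), 1e-12)
            thetas[k] += lr * grad
    return thetas, priors, resp

# Toy data: five candidate trajectories with one feature each; half of the
# demonstrations pick the lowest-feature trajectory, half the highest.
feats = np.array([[-2.0], [-1.0], [0.0], [1.0], [2.0]])
demo_idx = np.array([0] * 10 + [4] * 10)
thetas, priors, resp = em_cluster_irl(demo_idx, feats, n_clusters=2)
labels = resp.argmax(axis=1)  # hard cluster label per demonstration
```

Taking a single gradient step per cluster in each M-step, rather than running an inner IRL solver to convergence, is a design choice of this sketch. Which cluster index ends up with which behavior depends on the random initialization.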

Taxonomy

Topics: Reinforcement Learning in Robotics · Transportation and Mobility Innovations · Autonomous Vehicle Technology and Safety