Inverse Reinforce Learning with Nonparametric Behavior Clustering
Siddharthan Rajasekaran, Jinwei Zhang, and Jie Fu

TL;DR
This paper presents a non-parametric clustering approach to inverse reinforcement learning that simultaneously identifies multiple behavior patterns and their reward functions from diverse demonstrations, improving efficiency and handling behavioral heterogeneity.
Contribution
It introduces an iterative EM-based non-parametric clustering algorithm for IRL that efficiently learns multiple reward functions without solving separate IRL problems for each cluster.
Findings
Successfully clustered driver behaviors in grid-world and robot car simulations.
Demonstrated convergence and improved computational efficiency of the method.
Effectively inferred multiple reward functions from heterogeneous demonstrations.
Abstract
Inverse Reinforcement Learning (IRL) is the task of learning a single reward function given a Markov Decision Process (MDP) without defining the reward function, and a set of demonstrations generated by humans/experts. However, in practice, it may be unreasonable to assume that human behaviors can be explained by one reward function since they may be inherently inconsistent. Also, demonstrations may be collected from various users and aggregated to infer and predict users' behaviors. In this paper, we introduce the Non-parametric Behavior Clustering IRL algorithm to simultaneously cluster demonstrations and learn multiple reward functions from demonstrations that may be generated from more than one behavior. Our method is iterative: it alternates between clustering demonstrations into different behavior clusters and inverse learning the reward functions until convergence. It is built…
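The alternation described in the abstract (soft-assign demonstrations to behavior clusters, then refit each cluster's reward) can be illustrated with a small sketch. This is not the authors' algorithm. It makes several simplifying assumptions that are not in the paper summary: the number of clusters is fixed in advance (the non-parametric part is omitted), rewards are linear in trajectory feature counts, and the set of candidate trajectories is small enough to enumerate, so a maximum-entropy style likelihood can be normalized exactly. The names `em_cluster_irl`, `candidates`, and `demo_idx` are hypothetical.

```python
import numpy as np


def log_softmax(scores):
    # Numerically stable log-softmax over a 1-D array of scores.
    shifted = scores - scores.max()
    return shifted - np.log(np.exp(shifted).sum())


def em_cluster_irl(candidates, demo_idx, n_clusters,
                   n_iters=50, grad_steps=20, lr=0.5, seed=0):
    """Toy EM loop: cluster demonstrations and fit one linear reward per cluster.

    candidates: (n_traj, n_feat) feature counts of every candidate trajectory
                (assumed enumerable; an illustration-only simplification).
    demo_idx:   indices into `candidates` of the observed demonstrations.
    Returns (theta, resp): reward weights per cluster and soft assignments.
    """
    rng = np.random.default_rng(seed)
    n_feat = candidates.shape[1]
    theta = rng.normal(scale=0.1, size=(n_clusters, n_feat))
    prior = np.full(n_clusters, 1.0 / n_clusters)
    demo_feats = candidates[demo_idx]

    for _ in range(n_iters):
        # E-step: posterior probability of each cluster for each demonstration,
        # with P(trajectory | cluster k) proportional to exp(theta_k . features).
        log_lik = np.stack(
            [log_softmax(candidates @ theta[k])[demo_idx]
             for k in range(n_clusters)],
            axis=1,
        )
        log_post = np.log(prior) + log_lik
        log_post -= log_post.max(axis=1, keepdims=True)
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: update mixing weights, then take a few gradient-ascent steps
        # on each cluster's responsibility-weighted log-likelihood.
        prior = resp.mean(axis=0) + 1e-12  # small floor avoids log(0)
        for k in range(n_clusters):
            weights = resp[:, k]
            empirical = (weights @ demo_feats) / max(weights.sum(), 1e-12)
            for _ in range(grad_steps):
                probs = np.exp(log_softmax(candidates @ theta[k]))
                expected = probs @ candidates
                # Gradient: weighted empirical features minus model expectation.
                theta[k] += lr * (empirical - expected)

    return theta, resp
```

Calling `em_cluster_irl(candidates, demo_idx, n_clusters=2)` returns one weight vector per cluster and a row of cluster responsibilities per demonstration. The key point the sketch shares with the summary above is that all reward functions are updated inside one loop, rather than by running a separate IRL solve for every candidate clustering.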
Taxonomy
Topics: Reinforcement Learning in Robotics · Transportation and Mobility Innovations · Autonomous Vehicle Technology and Safety
