Density Matching Reward Learning
Sungjoon Choi, Kyungjae Lee, Andy Park, Songhwai Oh

TL;DR
This paper introduces density matching reward learning (DMRL), a model-free, density-based inverse reinforcement learning algorithm that infers an expert's reward function without requiring model dynamics. A kernel extension, KDMRL, handles nonlinear reward functions; in experiments it outperforms existing IRL methods in nonlinear reward settings and is competitive in linear ones.
Contribution
The paper proposes DMRL, a model-free density matching IRL algorithm, and extends it to nonlinear reward functions using kernel methods, with analytical parameter computation and extensive empirical evaluation.
Findings
KDMRL outperforms existing IRL methods in nonlinear reward settings.
KDMRL achieves competitive results in linear reward scenarios.
KDMRL effectively learns driving styles in complex, dynamic environments.
Abstract
In this paper, we focus on the problem of inferring the underlying reward function of an expert given demonstrations, which is often referred to as inverse reinforcement learning (IRL). In particular, we propose a model-free density-based IRL algorithm, named density matching reward learning (DMRL), which does not require model dynamics. The performance of DMRL is analyzed theoretically and the sample complexity is derived. Furthermore, the proposed DMRL is extended to handle nonlinear IRL problems by assuming that the reward function is in the reproducing kernel Hilbert space (RKHS) and kernel DMRL (KDMRL) is proposed. The parameters for KDMRL can be computed analytically, which greatly reduces the computation time. The performance of KDMRL is extensively evaluated in two sets of experiments: grid world and track driving experiments. In grid world experiments, the proposed KDMRL method…
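A minimal sketch of the density-matching idea on a small discrete problem may help make the abstract concrete. The summary above does not state the DMRL objective, so the objective used here is an illustrative assumption: choose the reward R that maximizes the inner product with the empirical state-action density of the demonstrations, subject to a unit L2-norm constraint. Under that assumption, the Cauchy-Schwarz inequality gives the maximizer in closed form as the normalized density. The function names, the toy demonstrations, and the tabular setting are all hypothetical and are not taken from the paper.

```python
import numpy as np

def empirical_density(demos, n_states, n_actions):
    """Estimate a state-action density from demonstrations by counting visits.

    `demos` is a list of (state, action) index pairs; no model dynamics are used.
    """
    counts = np.zeros((n_states, n_actions))
    for s, a in demos:
        counts[s, a] += 1.0
    return counts / counts.sum()

def density_matching_reward(mu):
    """Maximize <mu, R> subject to ||R||_2 <= 1 (assumed objective).

    By Cauchy-Schwarz the maximizer is mu / ||mu||_2.
    """
    return mu / np.linalg.norm(mu)

# Hypothetical toy demonstrations: (state, action) pairs from an "expert".
demos = [(0, 1), (0, 1), (1, 0), (2, 1)]
mu = empirical_density(demos, n_states=3, n_actions=2)
reward = density_matching_reward(mu)
```

In this sketch the most frequently demonstrated state-action pair receives the highest reward, and pairs that never appear in the demonstrations receive zero. The kernel variant described in the abstract would replace the tabular reward with a function in an RKHS; that step is not sketched here because the summary does not give its form.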
Taxonomy
Topics: Reinforcement Learning in Robotics · Gaussian Processes and Bayesian Inference · Model Reduction and Neural Networks
