Conditional Kernel Imitation Learning for Continuous State Environments

Rishabh Agrawal; Nathan Dahlin; Rahul Jain; Ashutosh Nayyar

arXiv:2308.12573·cs.LG·August 25, 2023

Open Access

TL;DR

This paper introduces an imitation learning framework for continuous state environments. It estimates the transition dynamics with kernel methods and seeks a policy that satisfies the Markov balance equation. It requires no interaction with the environment and is reported to outperform existing algorithms.

Contribution

The paper proposes an IL method built on conditional kernel density estimation and the Markov balance equation, with an asymptotic consistency result for the estimator and strong empirical results on benchmark environments.
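To make the central tool concrete, here is a minimal sketch of a conditional kernel density estimator in Python with NumPy. It is not the authors' implementation: the Gaussian product kernel, the fixed bandwidths, and the names `gaussian_kernel` and `conditional_kde` are assumptions chosen for illustration. In an imitation learning setting, one plausible use is to take `x` as a state-action pair and `y` as the next state, so that the estimate approximates the transition density.

```python
import numpy as np


def gaussian_kernel(u, h):
    """Product Gaussian kernel with bandwidth h, one value per row of u.

    u has shape (n, d); the result has shape (n,).
    """
    d = u.shape[1]
    norm = (2.0 * np.pi * h ** 2) ** (d / 2.0)
    return np.exp(-0.5 * np.sum((u / h) ** 2, axis=1)) / norm


def conditional_kde(y, x, Y, X, h_x=0.3, h_y=0.3):
    """Estimate the conditional density p(y | x) from paired samples.

    X has shape (n, d_x) and Y has shape (n, d_y); x and y are single
    query points of shape (d_x,) and (d_y,). The bandwidths h_x and h_y
    are hypothetical defaults, not values taken from the paper.
    """
    wx = gaussian_kernel(X - x, h_x)  # how close each sample is to x
    wy = gaussian_kernel(Y - y, h_y)  # kernel density around each Y[i]
    denom = wx.sum()
    if denom == 0.0:
        # No sample carries weight near x, so the ratio is undefined.
        return 0.0
    return float((wx * wy).sum() / denom)
```

The estimate is a weighted average of kernels centered on the observed `Y[i]`, with weights given by how close each `X[i]` lies to the query `x`. Because the weights are normalized, the estimate integrates to one over `y` whenever at least one sample has nonzero weight. The bandwidths control the bias-variance trade-off and would normally be tuned to the data.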

Findings

01. Outperforms state-of-the-art IL algorithms in benchmark environments

02. Estimates transition dynamics accurately using kernel methods

03. Satisfies probabilistic balance equations asymptotically

Abstract

Imitation Learning (IL) is an important paradigm within the broader reinforcement learning (RL) methodology. Unlike most of RL, it does not assume the availability of reward feedback. Reward inference and shaping are known to be difficult and error-prone, particularly when the demonstration data comes from human experts. Classical methods such as behavioral cloning and inverse reinforcement learning are highly sensitive to estimation errors, a problem that is particularly acute in continuous state space problems. Meanwhile, state-of-the-art IL algorithms convert behavioral policy learning problems into distribution-matching problems, which often require additional online interaction data to be effective. In this paper, we consider the problem of imitation learning in continuous state space environments based solely on observed behavior, without access to transition dynamics…

Taxonomy

Topics: Reinforcement Learning in Robotics