Going Beyond Expert Performance via Deep Implicit Imitation Reinforcement Learning
Iason Chrysomallis, Georgios Chalkiadakis

TL;DR
This paper introduces a deep implicit imitation reinforcement learning framework that lets agents learn from observation-only datasets, surpass suboptimal experts, and operate with action spaces that differ from the expert's. Reported gains reach up to 130% higher episodic returns than standard DQN.
Contribution
The paper presents the DIIQN algorithm for imitation from observations and extends it with HA-DIIQN, which handles heterogeneous action spaces. Together they relax two requirements of prior methods: action-labeled demonstrations and optimal or near-optimal experts.
Findings
DIIQN achieves up to 130% higher episodic returns than standard DQN.
DIIQN outperforms existing implicit imitation methods, which cannot surpass expert performance.
HA-DIIQN learns up to 64% faster in heterogeneous action scenarios.
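To make the "up to 130% higher" figure concrete, the short sketch below computes a relative gain in episodic return. The function name `percent_gain` and the sample numbers are illustrative assumptions, not values or code from the paper; the paper may define its comparison differently.

```python
def percent_gain(baseline_return: float, new_return: float) -> float:
    """Relative improvement of new_return over baseline_return, in percent.

    Hypothetical helper for illustration only. Dividing by the absolute
    baseline keeps the sign meaningful when the baseline return is negative.
    """
    if baseline_return == 0:
        raise ValueError("relative gain is undefined for a zero baseline")
    return (new_return - baseline_return) / abs(baseline_return) * 100.0


# Made-up numbers: a baseline averaging 100 per episode and an agent
# averaging 230 per episode correspond to a 130% gain (2.3x the baseline).
gain = percent_gain(100.0, 230.0)
```

Read this way, "130% higher" means the return is 2.3 times the baseline, not 1.3 times.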
Abstract
Imitation learning traditionally requires complete state-action demonstrations from optimal or near-optimal experts. These requirements severely limit practical applicability, as many real-world scenarios provide only state observations without corresponding actions and expert performance is often suboptimal. In this paper we introduce a deep implicit imitation reinforcement learning framework that addresses both limitations by combining deep reinforcement learning with implicit imitation learning from observation-only datasets. Our main algorithm, Deep Implicit Imitation Q-Network (DIIQN), employs an action inference mechanism that reconstructs expert actions through online exploration and integrates a dynamic confidence mechanism that adaptively balances expert-guided and self-directed learning. This enables the agent to leverage expert guidance for accelerated training while…
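The two mechanisms named in the abstract can be illustrated with a toy sketch. It is not the authors' algorithm: the nearest-mean-displacement rule, the linear blend, and every name here (`infer_expert_action`, `blended_target`, `agent_transitions`) are assumptions chosen to show the general idea of recovering an unlabeled expert action from the agent's own experience and weighting expert guidance by a confidence value.

```python
import numpy as np


def infer_expert_action(s, s_next, agent_transitions):
    """Guess which action explains an observed expert transition s -> s_next.

    agent_transitions (hypothetical structure): dict mapping each action to a
    list of (state, next_state) pairs collected by the agent's own exploration.
    Returns the action whose mean state change is closest to the expert's,
    together with that distance.
    """
    target_delta = np.asarray(s_next, dtype=float) - np.asarray(s, dtype=float)
    best_action, best_dist = None, float("inf")
    for action, pairs in agent_transitions.items():
        if not pairs:
            continue  # nothing experienced yet for this action
        deltas = np.array(
            [np.asarray(n, dtype=float) - np.asarray(p, dtype=float) for p, n in pairs]
        )
        dist = float(np.linalg.norm(deltas.mean(axis=0) - target_delta))
        if dist < best_dist:
            best_action, best_dist = action, dist
    return best_action, best_dist


def blended_target(own_target, expert_target, confidence):
    """Mix self-directed and expert-guided targets (assumed linear blend).

    confidence is clipped to [0, 1]; 0 ignores the expert, 1 follows it fully.
    """
    c = float(np.clip(confidence, 0.0, 1.0))
    return c * expert_target + (1.0 - c) * own_target


# Toy 2-D grid: action 0 moves +x, action 1 moves +y (made-up data).
experience = {0: [((0, 0), (1, 0))], 1: [((0, 0), (0, 1))]}
action, dist = infer_expert_action((2, 2), (2, 3), experience)
target = blended_target(own_target=1.0, expert_target=3.0, confidence=0.25)
```

In this sketch a low confidence keeps the agent close to its own estimate, which is one simple way an agent could stop following a suboptimal expert once its own value estimates improve.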
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Explainable Artificial Intelligence (XAI)
