Enhancing Inverse Reinforcement Learning through Encoding Dynamic Information in Reward Shaping

Simon Sinong Zhan; Philip Wang; Qingyuan Wu; Yixuan Wang; Ruochen Jiao; Chao Huang; Qi Zhu

arXiv:2410.03847·cs.LG·February 12, 2026

Open Access

TL;DR

This paper introduces a Model-Enhanced AIRL framework that incorporates dynamic information into reward shaping, improving performance and sample efficiency in stochastic environments.

Contribution

It proposes a novel reward shaping method that integrates transition model estimation into AIRL, with theoretical guarantees and improved empirical results.

Findings

01 Achieves superior performance in stochastic environments

02 Improves sample efficiency significantly

03 Maintains competitive results in deterministic environments

Abstract

In this paper, we address a limitation of the Adversarial Inverse Reinforcement Learning (AIRL) method in stochastic environments, where its theoretical results do not hold and its performance degrades. To address this issue, we propose a novel method that infuses dynamics information into reward shaping, with a theoretical guarantee for the induced optimal policy in stochastic environments. Building on these model-enhanced rewards, we present a Model-Enhanced AIRL framework, which integrates transition model estimation directly into reward shaping. Furthermore, we provide a comprehensive theoretical analysis of the reward error bound and the performance difference bound for our method. Experimental results on MuJoCo benchmarks show that our method achieves superior performance in stochastic environments and competitive performance in deterministic…
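The abstract does not state the reward formula, so the following is only a minimal sketch of one way dynamics information could enter reward shaping. It assumes the standard AIRL shaped reward f(s, a, s') = g(s, a) + γ h(s') − h(s) and, as an illustrative assumption rather than the paper's confirmed definition, replaces h(s') with its expectation under an estimated transition model. The tabular setup and the names `g`, `h`, `T_hat`, `airl_shaped_reward`, and `model_enhanced_reward` are hypothetical.

```python
import numpy as np


def airl_shaped_reward(g, h, s, a, s_next, gamma):
    """Standard AIRL shaping: f(s, a, s') = g(s, a) + gamma * h(s') - h(s).

    Depends on the sampled next state, so it is noisy when transitions are stochastic.
    """
    return g[s, a] + gamma * h[s_next] - h[s]


def model_enhanced_reward(g, h, T_hat, gamma):
    """Assumed model-based variant (illustrative, not taken from the paper):
    r_hat(s, a) = g(s, a) + gamma * E_{s' ~ T_hat(.|s, a)}[h(s')] - h(s).

    T_hat has shape (n_states, n_actions, n_states); returns (n_states, n_actions).
    """
    expected_h_next = T_hat @ h  # expectation of h over next states, shape (S, A)
    return g + gamma * expected_h_next - h[:, None]


# Toy tabular problem: 2 states, 2 actions (all numbers are made up).
gamma = 0.9
g = np.array([[1.0, 0.0], [0.5, 2.0]])   # reward approximator g(s, a)
h = np.array([0.0, 10.0])                # shaping potential h(s)
T_hat = np.array([
    [[0.5, 0.5], [1.0, 0.0]],            # estimated T(. | s=0, a=0), T(. | s=0, a=1)
    [[0.0, 1.0], [0.2, 0.8]],            # estimated T(. | s=1, a=0), T(. | s=1, a=1)
])

r_hat = model_enhanced_reward(g, h, T_hat, gamma)

# Sample-based rewards for (s=0, a=0) differ by sampled next state ...
f_to_0 = airl_shaped_reward(g, h, 0, 0, 0, gamma)
f_to_1 = airl_shaped_reward(g, h, 0, 0, 1, gamma)
# ... while the model-based reward is their expectation under T_hat.
mean_f = 0.5 * f_to_0 + 0.5 * f_to_1
```

In this toy case the two sampled rewards for the same state-action pair differ only through the next state, and the model-based value equals their average under `T_hat`, which is the intuition for why a transition model might help in stochastic environments.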

Taxonomy

Topics: Adversarial Robustness in Machine Learning