Diverse Policies Recovering via Pointwise Mutual Information Weighted Imitation Learning
Hanlin Yang, Jian Yao, Weiming Liu, Qing Wang, Hanmin Qin, Hansheng Kong, Kirk Tang, Jiechao Xiong, Chao Yu, Kai Li, Junliang Xing, Hongwu Chen, Juchao Zhuo, Qiang Fu, Yang Wei, Haobo Fu

TL;DR
This paper introduces a novel approach in imitation learning that weights state-action pairs by pointwise mutual information to better recover diverse policies from expert trajectories, emphasizing style-relevant data.
Contribution
It proposes a new weighted behavioral cloning method using pointwise mutual information to improve diversity and accuracy in policy recovery.
Findings
Enhanced policy recovery accuracy demonstrated in experiments.
Effective identification of style-relevant state-action pairs.
Theoretical justification supports the proposed weighting mechanism.
Abstract
Recovering a spectrum of diverse policies from a set of expert trajectories is an important research topic in imitation learning. After determining a latent style for a trajectory, previous diverse policy recovery methods usually employ a vanilla behavioral cloning objective conditioned on the latent style, treating each state-action pair in the trajectory with equal importance. Based on the observation that, in many scenarios, behavioral styles are highly relevant to only a subset of state-action pairs, this paper presents a new principled method for diverse policy recovery. In particular, after inferring or assigning a latent style for a trajectory, we enhance vanilla behavioral cloning by incorporating a weighting mechanism based on pointwise mutual information. This additional weighting reflects the significance of each state-action pair's contribution to…
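The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a discrete latent style z with a known prior p(z), a hypothetical estimate of the posterior p(z | s, a) for each state-action pair, and a weight equal to the exponential of the pointwise mutual information, p(z | s, a) / p(z). The abstract does not say how the paper estimates the pointwise mutual information or exactly how it enters the loss, so the function names and the toy numbers below are illustrative assumptions only.

```python
import math


def pmi(posterior_z, prior_z):
    """Pointwise mutual information between a style z and a pair (s, a):
    log p(z | s, a) - log p(z). Both inputs are assumed to be given."""
    return math.log(posterior_z / prior_z)


def weighted_bc_loss(log_probs, weights):
    """Weighted behavioral cloning loss: the mean of -w_i * log pi(a_i | s_i, z).
    With all weights equal to 1 this reduces to vanilla behavioral cloning."""
    total = sum(w * lp for w, lp in zip(weights, log_probs))
    return -total / len(log_probs)


# Toy data (assumed): two state-action pairs from a trajectory with style z.
prior_z = 0.5            # p(z), assumed uniform over two styles
posteriors = [0.5, 0.9]  # hypothetical p(z | s_i, a_i) for each pair
log_probs = [-1.0, -1.0] # hypothetical log pi(a_i | s_i, z) under the policy

# Assumed weighting: exp(PMI) = p(z | s, a) / p(z).
weights = [math.exp(pmi(p, prior_z)) for p in posteriors]

vanilla_loss = weighted_bc_loss(log_probs, [1.0, 1.0])
pmi_loss = weighted_bc_loss(log_probs, weights)
```

In this toy setting the first pair carries no information about the style (its posterior equals the prior), so its weight stays at 1, while the second pair is style-relevant and receives a larger weight. The weighted loss therefore puts more emphasis on imitating the second pair than vanilla behavioral cloning does.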
Taxonomy
Topics: Artificial Intelligence in Law
Methods: Focus · Sparse Evolutionary Training
