Imitating from auxiliary imperfect demonstrations via Adversarial Density Weighted Regression
Ziqi Zhang, Zifeng Zhuang, Jingzehua Xu, Yiyuan Yang, Yubo Huang, Donglin Wang, Shuai Zhang

TL;DR
This paper introduces Adversarial Density Regression (ADR), a novel one-step supervised imitation learning framework that effectively corrects policies trained on imperfect demonstrations to match expert distributions without relying on the Bellman operator.
Contribution
The paper presents ADR, a new IL method combining density-weighted behavioral cloning with theoretical guarantees, addressing limitations of previous algorithms like OOD issues and reliance on multi-step updates.
Findings
ADR outperforms existing IL algorithms on Gym-Mujoco tasks.
ADR achieves an 89.5% improvement over IQL trained with ground-truth rewards on Adroit and Kitchen tasks.
Theoretical analysis shows ADR effectively aligns policy distribution with expert distribution.
Abstract
We propose a novel one-step supervised imitation learning (IL) framework called Adversarial Density Regression (ADR). This framework uses demonstrations to correct a policy learned on unknown-quality data so that it matches the expert distribution, without relying on the Bellman operator. Specifically, ADR addresses several limitations of previous IL algorithms. First, most IL algorithms are based on the Bellman operator and therefore inevitably suffer from cumulative offsets caused by sub-optimal rewards during multi-step updates. In addition, off-policy training frameworks suffer from Out-of-Distribution (OOD) state-actions. Second, while conservative terms help with the OOD issue, balancing the conservative term is difficult. To address these limitations, we fully integrate a one-step density-weighted Behavioral Cloning (BC) objective for IL with auxiliary imperfect demonstrations.…
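To make the idea of a one-step density-weighted BC objective more concrete, here is a minimal sketch in Python. It is not the paper's implementation: the linear policy, the closed-form weighted least-squares fit, the discriminator-style probabilities, and the names `density_ratio_weights` and `weighted_bc_fit` are all assumptions chosen for illustration. The sketch only shows how weighting each sample by an estimated expert-to-auxiliary density ratio pulls a cloned policy toward expert behavior in a single supervised step, with no Bellman backup.

```python
import numpy as np


def density_ratio_weights(d_expert_prob, clip=10.0):
    """Map hypothetical discriminator outputs D(s, a) in (0, 1) to weights.

    For a balanced binary classifier separating expert from auxiliary data,
    D / (1 - D) estimates p_expert(s, a) / p_aux(s, a). Weights are clipped
    from above for numerical stability (an assumption of this sketch).
    """
    d = np.clip(d_expert_prob, 1e-6, 1.0 - 1e-6)
    return np.minimum(d / (1.0 - d), clip)


def weighted_bc_fit(states, actions, weights, ridge=1e-6):
    """Fit a linear policy a = s @ W by weighted least squares.

    Minimizes sum_i w_i * ||a_i - s_i @ W||^2 in closed form, i.e. a single
    supervised regression step.
    """
    sw = states * weights[:, None]              # diag(w) @ S
    gram = states.T @ sw + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, sw.T @ actions)


# Synthetic data (illustrative only): an "expert" and an "imperfect" policy.
rng = np.random.default_rng(0)
W_expert = rng.normal(size=(3, 2))
W_imperfect = -W_expert

s_exp = rng.normal(size=(500, 3))
s_aux = rng.normal(size=(500, 3))
a_exp = s_exp @ W_expert + 0.01 * rng.normal(size=(500, 2))
a_aux = s_aux @ W_imperfect + 0.01 * rng.normal(size=(500, 2))

states = np.vstack([s_exp, s_aux])
actions = np.vstack([a_exp, a_aux])

# Stand-in for a learned density estimate: high on expert-like samples.
d_prob = np.concatenate([np.full(500, 0.9), np.full(500, 0.1)])

W_plain = weighted_bc_fit(states, actions, np.ones(len(states)))
W_weighted = weighted_bc_fit(states, actions, density_ratio_weights(d_prob))

err_plain = np.linalg.norm(W_plain - W_expert)
err_weighted = np.linalg.norm(W_weighted - W_expert)
print(f"plain BC error: {err_plain:.3f}, weighted BC error: {err_weighted:.3f}")
```

In this toy setting the unweighted fit averages the two behaviors, while the density-weighted fit stays close to the expert policy. A real system would replace the fixed probabilities with a learned density estimate and the linear policy with a neural network.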
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning · Fault Detection and Control Systems
Methods: ALIGN · Implicit Q-Learning
