State Diversity Matters in Offline Behavior Distillation

Shiye Lei; Zhihao Cheng; Dacheng Tao

arXiv:2512.06692·cs.LG·December 9, 2025

TL;DR

By analyzing the roles of state quality and state diversity, this paper shows that emphasizing state diversity in Offline Behavior Distillation (OBD) improves policy performance, especially when the original dataset has limited diversity.

Contribution

The paper identifies the importance of state diversity in OBD, gives a theoretical account of how state quality and state diversity reduce different error terms, and introduces a simple density-weighted algorithm to improve distillation.
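To make the idea of a density-weighted objective concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each sample's weight is the reciprocal of an estimated state density, so that states in sparse regions count more in a behavior cloning loss. The k-nearest-neighbour density estimator, the squared-error loss, the `power` parameter, and all function names are hypothetical choices made for illustration.

```python
import math


def knn_density(states, k=3):
    """Crude k-nearest-neighbour density estimate for each state.

    Assumption for illustration: density is taken as 1 / r_k^d, where r_k is
    the distance to the k-th nearest neighbour and d is the state dimension.
    Constant factors are dropped because the weights are normalised later.
    Requires at least two states.
    """
    n = len(states)
    d = len(states[0])
    densities = []
    for i, s in enumerate(states):
        dists = sorted(math.dist(s, t) for j, t in enumerate(states) if j != i)
        r_k = dists[min(k, n - 1) - 1]
        densities.append(1.0 / (r_k ** d + 1e-8))
    return densities


def inverse_density_weights(states, k=3, power=1.0):
    """Weights proportional to density^(-power), rescaled to have mean 1."""
    raw = [rho ** (-power) for rho in knn_density(states, k)]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


def weighted_bc_loss(pred_actions, target_actions, weights):
    """Weighted mean of per-sample squared action errors."""
    per_sample = [
        sum((p - t) ** 2 for p, t in zip(pa, ta))
        for pa, ta in zip(pred_actions, target_actions)
    ]
    return sum(w * l for w, l in zip(weights, per_sample)) / len(per_sample)


# Four clustered states and one isolated state: the isolated one is up-weighted.
states = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
weights = inverse_density_weights(states, k=2)
```

Rescaling the weights to mean 1 keeps the weighted loss on roughly the same scale as the unweighted one, so a learning rate tuned for the unweighted loss is a reasonable starting point.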

Findings

01. State diversity correlates with better policy performance under high training loss.

02. The proposed density-weighted algorithm (SDW) improves distillation results on datasets with limited state diversity.

03. Theoretical analysis highlights the dominance of surrounding error over pivotal error in certain scenarios.

Abstract

Offline Behavior Distillation (OBD), which condenses massive offline RL data into a compact synthetic behavioral dataset, offers a promising approach for efficient policy training and can be applied across various downstream RL tasks. In this paper, we uncover a misalignment between original and distilled datasets, observing that a high-quality original dataset does not necessarily yield a superior synthetic dataset. Through an empirical analysis of policy performance under varying levels of training loss, we show that datasets with greater state diversity outperform those with higher state quality when training loss is substantial, as is often the case in OBD, whereas the relationship reverses under minimal loss, which contributes to the misalignment. By associating state quality and diversity with reducing pivotal and surrounding error, respectively, our theoretical analysis…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Reinforcement Learning in Robotics · Machine Learning and Data Classification