ExpertGen: Scalable Sim-to-Real Expert Policy Learning from Imperfect Behavior Priors
Zifan Xu, Ran Gong, Maria Vittoria Minniti, Ahmet Salih Gundogdu, Eric Rosen, Kausik Sivakumar, Riedana Yan, Zixing Wang, Di Deng, Peter Stone, Xiaohan Zhang, Karl Schmeckpeper

TL;DR
ExpertGen is a scalable framework that learns expert policies in simulation from imperfect demonstrations, enabling effective sim-to-real transfer for robotic manipulation tasks.
Contribution
It combines a diffusion-policy behavior prior with reinforcement learning to enable scalable, safe, and high-quality policy learning from imperfect data.
Findings
Achieves over 90% success on industrial assembly tasks.
Attains 85% success on long-horizon manipulation benchmarks.
Successfully transfers policies from simulation to real robots.
Abstract
Learning generalizable and robust behavior cloning policies requires large volumes of high-quality robotics data. While human demonstrations (e.g., through teleoperation) serve as the standard source for expert behaviors, acquiring such data at scale in the real world is prohibitively expensive. This paper introduces ExpertGen, a framework that automates expert policy learning in simulation to enable scalable sim-to-real transfer. ExpertGen first initializes a behavior prior using a diffusion policy trained on imperfect demonstrations, which may be synthesized by large language models or provided by humans. Reinforcement learning is then used to steer this prior toward high task success by optimizing the diffusion model's initial noise while keeping the original policy frozen. By keeping the pretrained diffusion policy frozen, ExpertGen regularizes exploration to remain within safe, human-like…
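The idea of steering a frozen diffusion policy through its initial noise can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the hand-written "denoiser" (a fixed contraction toward a prior action, with no learned weights), the quadratic reward, and the use of a cross-entropy method as a simple stand-in for the reinforcement learning step.

```python
import numpy as np

def frozen_diffusion_policy(obs, noise, steps=10, step_size=0.2):
    """Toy stand-in for a pretrained diffusion policy (hypothetical, no learned weights).

    Deterministically refines the initial noise toward a prior action mode,
    so the output stays near the behavior prior for any bounded noise.
    """
    prior_mode = np.tanh(obs)          # assumed "demonstrated" action for this observation
    action = np.array(noise, dtype=float)
    for _ in range(steps):             # fixed refinement loop; nothing here is trained
        action = action + step_size * (prior_mode - action)
    return action

def task_reward(action, target):
    """Toy reward: negative squared distance to a task-optimal action."""
    return -float(np.sum((action - target) ** 2))

def steer_noise(obs, target, iters=30, pop=64, n_elite=8, seed=0):
    """Optimize only the mean of the initial-noise distribution (CEM, fixed unit std)."""
    rng = np.random.default_rng(seed)
    mu = np.zeros_like(obs, dtype=float)   # start from the standard Gaussian prior
    for _ in range(iters):
        noises = mu + rng.standard_normal((pop, obs.shape[0]))
        rewards = np.array(
            [task_reward(frozen_diffusion_policy(obs, z), target) for z in noises]
        )
        elite = noises[np.argsort(rewards)[-n_elite:]]   # highest-reward noise samples
        mu = elite.mean(axis=0)        # only the noise mean changes; the policy does not
    return mu

obs = np.array([0.5, -0.3])
target = np.tanh(obs) + np.array([0.2, 0.1])   # task optimum slightly off the prior mode
mu = steer_noise(obs, target)
before = task_reward(frozen_diffusion_policy(obs, np.zeros(2)), target)
after = task_reward(frozen_diffusion_policy(obs, mu), target)
print(f"reward before steering: {before:.4f}, after: {after:.4f}")
```

In this toy setup the denoiser contracts any initial noise toward the prior mode, so the steered action improves the reward while staying close to the prior. This mirrors the regularization argument in the abstract only loosely; a real diffusion policy and RL algorithm would replace both stand-ins.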
