FDPP: Fine-tune Diffusion Policy with Human Preference

Yuxin Chen; Devesh K. Jha; Masayoshi Tomizuka; Diego Romeres

arXiv:2501.08259·cs.RO·January 15, 2025

TL;DR

FDPP introduces a method to adapt pre-trained robotic policies to new human preferences using preference-based reward learning and reinforcement learning, ensuring customization without losing original task performance.

Contribution

The paper presents FDPP, a novel approach combining preference-based reward learning with RL to fine-tune policies for personalized robotic manipulation.

Findings

01. FDPP effectively aligns policies with new human preferences.

02. Incorporating KL regularization prevents overfitting during fine-tuning.

03. FDPP maintains original task performance while customizing behavior.
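To make the KL-regularization finding concrete, here is a minimal sketch of a KL-penalized fine-tuning objective. It assumes scalar Gaussian action distributions with a shared standard deviation and a hypothetical penalty weight `beta`; the paper's actual formulation for diffusion policies is not given on this page and may differ.

```python
import math

def gaussian_kl(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL(N(mu_p, sigma_p^2) || N(mu_q, sigma_q^2)) for scalar Gaussians."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
            - 0.5)

def regularized_objective(reward, mu_new, mu_ref, sigma, beta):
    """Learned reward minus a KL penalty toward the pre-trained (reference) policy.

    Assumption: both policies are Gaussians with the same sigma; beta is a
    hypothetical hyperparameter controlling how far fine-tuning may drift.
    """
    return reward - beta * gaussian_kl(mu_new, sigma, mu_ref, sigma)

# Toy usage: the fine-tuned mean has moved 1.0 away from the reference mean.
objective = regularized_objective(reward=1.0, mu_new=1.0, mu_ref=0.0, sigma=1.0, beta=0.1)
```

A larger `beta` keeps the fine-tuned policy closer to the pre-trained one, which is the intuition behind using the penalty to retain original task performance.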

Abstract

Imitation learning from human demonstrations enables robots to perform complex manipulation tasks and has recently witnessed huge success. However, these techniques often struggle to adapt behavior to new preferences or changes in the environment. To address these limitations, we propose Fine-tuning Diffusion Policy with Human Preference (FDPP). FDPP learns a reward function through preference-based learning. This reward is then used to fine-tune the pre-trained policy with reinforcement learning (RL), aligning the pre-trained policy with new human preferences while still solving the original task. Our experiments across various robotic tasks and preferences demonstrate that FDPP effectively customizes policy behavior without compromising performance. Additionally, we show that incorporating Kullback-Leibler (KL) regularization during fine-tuning prevents over-fitting and…
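The preference-based reward learning step in the abstract can be illustrated with a small sketch. It assumes a Bradley-Terry preference model over pairs of trajectory segments, a common choice in preference-based RL; the abstract does not state which model FDPP uses, and `reward_fn`, the segments, and the labels below are hypothetical.

```python
import math

def segment_return(reward_fn, segment):
    """Sum the learned reward over the states of one trajectory segment."""
    return sum(reward_fn(state) for state in segment)

def preference_prob(reward_fn, seg_a, seg_b):
    """P(seg_a preferred over seg_b) under an assumed Bradley-Terry model."""
    diff = segment_return(reward_fn, seg_a) - segment_return(reward_fn, seg_b)
    return 1.0 / (1.0 + math.exp(-diff))

def preference_loss(reward_fn, seg_a, seg_b, label):
    """Cross-entropy loss; label is 1.0 if the human preferred seg_a, else 0.0."""
    p = preference_prob(reward_fn, seg_a, seg_b)
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(p + eps) + (1.0 - label) * math.log(1.0 - p + eps))

# Hypothetical reward that favors states near 1.0 (stand-in for a learned network).
toy_reward = lambda state: -abs(state - 1.0)
seg_near = [1.0, 1.0]   # segment the toy reward scores highly
seg_far = [0.0, 0.0]    # segment the toy reward scores poorly

loss_agree = preference_loss(toy_reward, seg_near, seg_far, label=1.0)
loss_disagree = preference_loss(toy_reward, seg_near, seg_far, label=0.0)
```

In practice the reward would be a parameterized model trained by minimizing this loss over many labeled pairs. The loss is low when the reward already ranks the segments the way the human did (`loss_agree`) and high otherwise (`loss_disagree`).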

Taxonomy

Topics: ICT Impact and Policies · Merger and Competition Analysis

Methods: Diffusion