Task-Agnostic Pre-training and Task-Guided Fine-tuning for Versatile Diffusion Planner

Chenyou Fan; Chenjia Bai; Zhao Shan; Haoran He; Yang Zhang; Zhen Wang

arXiv:2409.19949 · cs.LG · July 15, 2025

TL;DR

This paper introduces SODP, a two-stage diffusion planning framework that pre-trains on large-scale, sub-optimal multi-task data and fine-tunes with task-specific rewards, enabling versatile and efficient task adaptation.

Contribution

It presents a novel two-stage approach combining pre-training on sub-optimal data and RL-based fine-tuning for multi-task diffusion planning.

Findings

01. Outperforms state-of-the-art methods in multi-task domains.

02. Requires less data for reward-guided fine-tuning.

03. Effectively leverages sub-optimal trajectories for generalizable planning.

Abstract

Diffusion models have demonstrated their ability to model trajectories across multiple tasks. However, existing multi-task planners or policies typically rely on task-specific demonstrations via multi-task imitation, or require task-specific reward labels to facilitate policy optimization via Reinforcement Learning (RL). Both approaches are costly because of the substantial human effort required to collect expert data or design reward functions. To address these challenges, we aim to develop a versatile diffusion planner that can leverage large-scale inferior data containing task-agnostic sub-optimal trajectories and adapt quickly to specific tasks. In this paper, we propose SODP, a two-stage framework that leverages Sub-Optimal data to learn a Diffusion Planner that generalizes to various downstream tasks. Specifically, in the pre-training stage, we train a foundation…
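
As a rough illustration of what pre-training a diffusion planner on trajectory data can look like, the sketch below computes a standard DDPM-style noise-prediction loss on a batch of trajectories. This is an assumption: the truncated abstract above does not state the training objective, and the function names, the linear noise schedule, and the tensor shapes are all hypothetical choices made for illustration, not the authors' implementation. The reward-guided fine-tuning stage is not covered here.

```python
import numpy as np

def make_schedule(num_steps=100, beta_start=1e-4, beta_end=2e-2):
    """Cumulative signal retention (alpha-bar) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def noise_trajectories(x0, t, alphas_bar, rng):
    """Forward diffusion: mix clean trajectories x0 with Gaussian noise at step t."""
    eps = rng.standard_normal(x0.shape)
    signal = np.sqrt(alphas_bar[t])[:, None, None]
    noise = np.sqrt(1.0 - alphas_bar[t])[:, None, None]
    return signal * x0 + noise * eps, eps

def denoising_loss(predict_noise, x0, alphas_bar, rng):
    """Mean squared error between the predicted and the true injected noise."""
    t = rng.integers(0, len(alphas_bar), size=x0.shape[0])
    xt, eps = noise_trajectories(x0, t, alphas_bar, rng)
    return float(np.mean((predict_noise(xt, t) - eps) ** 2))

# Toy batch standing in for sub-optimal multi-task data (hypothetical shapes):
# 8 trajectories, horizon 16, 4 dimensions per step.
rng = np.random.default_rng(0)
alphas_bar = make_schedule()
trajs = rng.standard_normal((8, 16, 4))

# Placeholder "model" that always predicts zero noise; a real planner
# would be a trained network conditioned on the diffusion step t.
loss = denoising_loss(lambda xt, t: np.zeros_like(xt), trajs, alphas_bar, rng)
```

In this toy setup no task label or reward enters the loss, which is one way to read "task-agnostic" pre-training; a trained noise predictor would replace the zero-output placeholder.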

Videos

Task-Agnostic Pre-training and Task-Guided Fine-tuning for Versatile Diffusion Planner · SlidesLive

Taxonomy

Topics: Manufacturing Process and Optimization

Methods: Diffusion