SADP: Subgoal-Aware Diffusion Policy for Explainable Robots Learned from Foundation Model Generated Demonstrations

Site Hu; Takato Horii

arXiv:2605.16871 · cs.RO · May 19, 2026

TL;DR

This paper introduces SADP, a diffusion policy framework that uses foundation models to generate subgoal-annotated demonstrations, enabling robots to explain their decision process and improve task success.

Contribution

SADP is the first framework to incorporate subgoal-aware diffusion policies trained on foundation model-generated demonstrations for explainable robot manipulation.

Findings

01 SADP outperforms task-conditioned baselines in success rate.

02 SADP provides subgoal-level signals for progress monitoring.

03 SADP achieves high task success with built-in interpretability.

Abstract

Explainable robots require not only successful task execution but also the ability to expose their internal decision-making process in a user-friendly manner. However, most imitation learning methods are trained solely on task-level demonstrations, without explicitly modeling subgoal structure or execution progress. This limitation is further exacerbated by the scarcity of subgoal-level supervision in standard robot learning datasets, which restricts the development of robots that can convey the subtasks they are executing during long-horizon manipulation. To address this issue, this paper proposes Subgoal-Aware Diffusion Policy (SADP), a framework that leverages foundation models to autonomously generate subgoal-annotated demonstrations and trains diffusion policies on these datasets. SADP structures policy execution around human-interpretable subgoals by conditioning action generation on…
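As an illustration of what a subgoal-annotated demonstration and a subgoal-level progress signal could look like, here is a minimal Python sketch. It is not the authors' implementation: the `Step` record, the per-step text label, and the helpers `subgoal_segments` and `progress_report` are all hypothetical names chosen for this example, and the sketch says nothing about how SADP conditions its diffusion policy.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Step:
    """One timestep of a demonstration (all fields are illustrative assumptions)."""
    observation: Sequence[float]
    action: Sequence[float]
    subgoal: str  # hypothetical text label, e.g. assigned by a foundation model


def subgoal_segments(demo: List[Step]) -> List[Tuple[str, int, int]]:
    """Group consecutive steps that share a label into (label, start, end) spans."""
    segments: List[Tuple[str, int, int]] = []
    for i, step in enumerate(demo):
        if segments and segments[-1][0] == step.subgoal:
            label, start, _ = segments[-1]
            segments[-1] = (label, start, i)  # extend the current span
        else:
            segments.append((step.subgoal, i, i))  # open a new span
    return segments


def progress_report(demo: List[Step], t: int) -> str:
    """Return a human-readable description of the subgoal active at step t."""
    segments = subgoal_segments(demo)
    for index, (label, start, end) in enumerate(segments):
        if start <= t <= end:
            return f"subgoal {index + 1}/{len(segments)}: {label}"
    raise IndexError(f"step {t} is outside the demonstration")
```

Under these assumptions, a demonstration labeled `reach, reach, grasp, lift, lift` yields three segments, and querying step 2 reports the second of three subgoals. A per-step label is only one possible annotation format; span-level annotations would work equally well for this kind of monitoring.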
