Trace-Focused Diffusion Policy for Multi-Modal Action Disambiguation in Long-Horizon Robotic Manipulation

Yuxuan Hu; Xiangyu Chen; Chuhao Zhou; Yuxi Liu; Gen Li; Jindou Jia; Jianfei Yang

arXiv:2602.07388 · cs.RO · February 10, 2026

TL;DR

This paper introduces TF-DP, a diffusion-based policy that conditions on execution history to disambiguate actions in long-horizon robotic tasks, significantly improving robustness and consistency in visually complex environments.

Contribution

The paper proposes a trace-focused diffusion policy that explicitly incorporates execution history, addressing multi-modal action ambiguity in long-horizon robotic manipulation.

Findings

01. 80.56% improvement over vanilla diffusion policy in ambiguous tasks
02. 86.11% robustness increase under visual disturbances
03. Only 6.4% increase in inference runtime

Abstract

Generative model-based policies have shown strong performance in imitation-based robotic manipulation by learning action distributions from demonstrations. However, in long-horizon tasks, visually similar observations often recur across execution stages while requiring distinct actions, which leads to ambiguous predictions when policies are conditioned only on instantaneous observations, termed multi-modal action ambiguity (MA2). To address this challenge, we propose the Trace-Focused Diffusion Policy (TF-DP), a simple yet effective diffusion-based framework that explicitly conditions action generation on the robot's execution history. TF-DP represents historical motion as an explicit execution trace and projects it into the visual observation space, providing stage-aware context when current observations alone are insufficient. In addition, the induced trace-focused field emphasizes…
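The abstract describes projecting an execution trace into the visual observation space and deriving a trace-focused field from it. The sketch below illustrates one way such a projection could look; it is not the authors' implementation. It assumes a pinhole camera with known intrinsics, past end-effector positions already expressed in the camera frame, and a Gaussian falloff around the projected trace as the field. The function names (`project_trace`, `trace_field`, `condition_image`), the intrinsics `K`, and the `sigma` parameter are all hypothetical.

```python
import numpy as np

def project_trace(points_cam, K):
    """Project 3D points (N, 3) in the camera frame to pixel coordinates (N, 2)
    using a pinhole model. K is an assumed 3x3 intrinsics matrix."""
    points_cam = np.asarray(points_cam, dtype=float)
    uvw = points_cam @ K.T            # row i equals K @ p_i
    return uvw[:, :2] / uvw[:, 2:3]   # perspective divide by depth

def trace_field(pixels, height, width, sigma=5.0):
    """Hypothetical soft mask: Gaussian falloff with distance to the nearest
    projected trace point. The paper's actual field may be defined differently."""
    ys, xs = np.mgrid[0:height, 0:width]
    grid = np.stack([xs, ys], axis=-1).astype(float)   # (H, W, 2) as (u, v)
    diff = grid[:, :, None, :] - pixels[None, None, :, :]
    d2 = (diff ** 2).sum(axis=-1)                      # (H, W, N) squared distances
    return np.exp(-d2.min(axis=-1) / (2.0 * sigma ** 2))

def condition_image(image, field):
    """Append the field as an extra channel so an image encoder could consume it.
    Channel concatenation is an assumption made for illustration."""
    return np.concatenate([image, field[..., None]], axis=-1)

# Toy example: three past end-effector positions, 1 m in front of the camera.
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 32.0],
              [0.0, 0.0, 1.0]])
trace = np.array([[0.0, 0.0, 1.0],
                  [0.1, 0.0, 1.0],
                  [0.1, 0.1, 1.0]])

pixels = project_trace(trace, K)
field = trace_field(pixels, 64, 64)
obs = condition_image(np.zeros((64, 64, 3)), field)
```

Under these assumptions, two visually identical frames from different stages of a task would produce different conditioned observations, because the projected trace differs between stages.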

Taxonomy

Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications