Disentangled Point Diffusion for Precise Object Placement

Lyuxing He; Eric Cai; Shobhit Aggarwal; Jianjun Wang; David Held

arXiv:2604.11793 · cs.RO · April 14, 2026

TL;DR

This paper introduces TAX-DPD, a hierarchical, disentangled point diffusion framework that significantly improves object placement precision and generalization in robotic manipulation tasks, validated through simulation and real-world experiments.

Contribution

The paper presents a novel hierarchical point diffusion approach with disentangled modeling of object geometry and placement, achieving state-of-the-art accuracy and generalization.

Findings

01. Achieves higher placement accuracy than prior SE(3)-diffusion methods.

02. Demonstrates strong generalization to different object geometries and scene configurations.

03. Validates effectiveness on high-precision industrial and cloth-hanging tasks.

Abstract

Recent advances in robotic manipulation have highlighted the effectiveness of learning from demonstration. However, while end-to-end policies excel in expressivity and flexibility, they struggle both in generalizing to novel object geometries and in attaining a high degree of precision. An alternative, object-centric approach frames the task as predicting the placement pose of the target object, providing a modular decomposition of the problem. Building on this goal-prediction paradigm, we propose TAX-DPD, a hierarchical, disentangled point diffusion framework that achieves state-of-the-art performance in placement precision, multi-modal coverage, and generalization to variations in object geometries and scene configurations. We model global scene-level placements through a novel feed-forward Dense Gaussian Mixture Model (GMM) that yields a spatially dense prior over global placements;…
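The abstract excerpt does not give the parameterization of the Dense GMM, so the following is only a generic illustration of what a spatially dense mixture prior can look like, not the authors' method. It assumes one isotropic Gaussian component per scene point, with a predicted log-weight and standard deviation for each; the function name `sample_dense_gmm` and all of its arguments are hypothetical.

```python
import math
import random


def sample_dense_gmm(points, logits, sigmas, rng):
    """Draw one 3D sample from a per-point isotropic Gaussian mixture.

    Assumed (hypothetical) inputs, one entry per scene point:
      points: (x, y, z) component means
      logits: unnormalized log-weights
      sigmas: isotropic standard deviations
    Returns the sampled 3D location and the index of the chosen component.
    """
    # Softmax numerator; subtracting the max keeps exp() numerically stable.
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    # random.choices accepts relative weights, so no explicit normalization.
    k = rng.choices(range(len(points)), weights=weights, k=1)[0]
    # Perturb the chosen mean with isotropic Gaussian noise.
    sample = tuple(rng.gauss(c, sigmas[k]) for c in points[k])
    return sample, k


if __name__ == "__main__":
    rng = random.Random(0)
    scene = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.2), (0.4, 0.3, 0.1)]
    location, component = sample_dense_gmm(
        scene, logits=[0.2, 1.5, -0.3], sigmas=[0.01, 0.02, 0.01], rng=rng
    )
    print(component, location)
```

Because every scene point carries its own component, a prior of this form can place probability mass on several distinct regions at once, which is one way to read the abstract's claim of multi-modal coverage.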
