PD$^{2}$GS: Part-Level Decoupling and Continuous Deformation of Articulated Objects via Gaussian Splatting

Haowen Wang; Xiaoping Yuan; Zhao Jin; Zhen Zhao; Zhengping Che; Yousong Xue; Jin Tian; Yakun Huang; Jian Tang

arXiv:2506.09663·cs.CV·March 3, 2026

Open Access 3 Reviews

TL;DR

PD$^{2}$GS introduces a novel framework for modeling articulated objects by learning a shared Gaussian field and representing interaction states as continuous deformations, enabling accurate part-level decoupling and smooth control without manual supervision.

Contribution

It proposes a unified approach that encodes geometry and kinematics jointly, refines part boundaries with vision priors, and supports continuous control and accurate modeling of articulated objects.

Findings

01

Outperforms prior methods in geometric accuracy

02

Achieves superior kinematic modeling and control consistency

03

Demonstrates effectiveness on both synthetic and real datasets

Abstract

Articulated objects are ubiquitous and important in robotics, AR/VR, and digital twins. Most self-supervised methods for articulated object modeling reconstruct discrete interaction states and relate them via cross-state geometric consistency, yielding representational fragmentation and drift that hinder smooth control of articulated configurations. We introduce PD$^{2}$GS, a novel framework that learns a shared canonical Gaussian field and models the arbitrary interaction state as its continuous deformation, jointly encoding geometry and kinematics. By associating each interaction state with a latent code and refining part boundaries using generic vision priors, PD$^{2}$GS enables accurate and reliable part-level decoupling while enforcing mutual exclusivity between parts and preserving scene-level coherence. This unified formulation supports part-aware reconstruction, fine-grained…
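The core idea in the abstract (one shared canonical set of Gaussians, with every interaction state obtained as a continuous deformation of it) can be illustrated with a toy sketch. This is not the paper's implementation: the `Gaussian` class, the scalar `state` standing in for the latent code, the hard argmax part assignment, and the single revolute joint about the z-axis are all assumptions made for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian:
    # Hypothetical minimal Gaussian: only a center and per-part scores.
    center: tuple        # (x, y, z) in the shared canonical frame
    part_logits: tuple   # one score per part (assumed, not from the paper)


def assign_part(g):
    # Hard argmax: each Gaussian belongs to exactly one part,
    # a simple stand-in for mutual exclusivity between parts.
    return max(range(len(g.part_logits)), key=lambda k: g.part_logits[k])


def rotate_about_z(p, pivot, angle):
    # Rotate point p around a vertical axis passing through pivot.
    x, y = p[0] - pivot[0], p[1] - pivot[1]
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y + pivot[0], s * x + c * y + pivot[1], p[2])


def deform(gaussians, state, max_angle=math.pi / 2, pivot=(0.0, 0.0, 0.0)):
    # state in [0, 1] plays the role of a per-state code (assumption).
    # Part 0 is static; part 1 is a revolute part whose angle varies
    # continuously with state, so intermediate states need no re-fitting.
    out = []
    for g in gaussians:
        if assign_part(g) == 1:
            out.append(rotate_about_z(g.center, pivot, state * max_angle))
        else:
            out.append(g.center)
    return out


if __name__ == "__main__":
    field = [
        Gaussian((1.0, 0.0, 0.0), (0.9, 0.1)),  # static part
        Gaussian((1.0, 0.0, 0.0), (0.2, 0.8)),  # moving part
    ]
    for s in (0.0, 0.5, 1.0):
        print(s, deform(field, s))
```

Because every state is computed from the same canonical centers, sweeping `state` produces a smooth trajectory for the moving part while the static part stays fixed, which is the property the abstract contrasts with reconstructing each discrete state separately.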

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

1. The paper introduces a unified framework that models articulated objects through continuous deformations of a shared canonical Gaussian field, effectively addressing the fragmentation and drift issues inherent in previous discrete-state reconstruction methods.
2. The method achieves part-level decoupling without manual supervision by leveraging generic vision priors and latent code associations, enabling fine-grained continuous control over articulated configurations.
3. The paper contrib

Weaknesses

1. The reconstruction results exhibit excessive noise, particularly evident in the real-world examples shown in Figure 13, which raises concerns about the method's robustness in practical scenarios.
2. In Section 3.2 on deformable Gaussian splatting, the methodology bears strong similarity to existing 4DGS works such as [a], yet these related approaches are not cited or discussed.
3. The paper does not provide information about inference time per sample, which would be valuable for understandi

Reviewer 02 · Rating 6 · Confidence 4

Strengths

- Technical contribution: the paper proposes a conceptually elegant unification of geometry and kinematics via continuous deformation of a canonical Gaussian field. Coarse-to-fine segmentation combining motion trajectories with SAM-driven boundary refinement is both novel and effective.
- The RS-Art dataset is a meaningful contribution, bridging synthetic–real gaps with paired RGB-D data and 3D models.
- Comprehensive experiments on an expanded PartNet-Mobility split and the new dataset de

Weaknesses

- The pipeline is complex and involves many heuristic components, which limits the scalability of the method.
- The proposed method seems to require multiple states, which places additional demands on data curation. Furthermore, ensuring that the camera coordinate systems of all states are aligned is a challenge. Outside the laboratory, for example in ordinary home scenarios, it is difficult to obtain multiple states with aligned coordinate systems, and the er

Reviewer 03 · Rating 6 · Confidence 4

Strengths

1. The newly proposed RS-Art dataset should be useful for further research if made public, especially the real-world captures.
2. The paper seems to achieve SOTA performance over baselines that use multi-state multi-view images in most cases.
3. The authors conducted extensive experiments on different datasets.

Weaknesses

1. The whole system seems to be composed of numerous parts, which may make it somewhat complicated and hard to extend.
2. Some visualizations of the newly proposed dataset, including the data itself and the reconstructed results in videos, would help readers grasp the new dataset.
3. The proposed method seems somewhat incremental, though it achieves the best performance in most cases. It does not deal with physical plausibility, such as 3D penetration. Its setting is also not unique, as the main difference wi

Taxonomy

Topics: 3D Shape Modeling and Analysis · Robot Manipulation and Learning · Human Pose and Action Recognition