3D Part Assembly Generation with Instance Encoded Transformer
Rufeng Zhang, Tao Kong, Weihao Wang, Xuan Han, Mingyu You

TL;DR
This paper introduces a transformer-based framework for 3D furniture assembly that uses instance encoding to differentiate parts, reporting over 10% improvement over state-of-the-art methods in part pose estimation.
Contribution
It presents a novel multi-layer transformer model with unique instance encoding for 6-DoF part pose estimation in furniture assembly, including an extension to in-process assembly tasks.
Findings
Over 10% improvement over state-of-the-art methods
Effective geometric and relational reasoning between parts
Successful extension to in-process assembly scenarios
Abstract
It is desirable to enable robots to perform automatic assembly. Structural understanding of object parts plays a crucial role in this task yet remains relatively unexplored. In this paper, we focus on the setting of furniture assembly from a complete set of part geometries, which is essentially a 6-DoF part pose estimation problem. We propose a multi-layer transformer-based framework that involves geometric and relational reasoning between parts to update the part poses iteratively. We carefully design a unique instance encoding to resolve the ambiguity between geometrically similar parts so that all parts can be distinguished. In addition to assembling from scratch, we extend our framework to a new task called in-process part assembly. Analogous to furniture maintenance, it requires robots to continue with unfinished products and assemble the remaining parts into appropriate positions.…
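To make two ideas from the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes a 6-DoF pose is stored as a unit quaternion (w, x, y, z) plus a translation, and that instance encoding can be approximated by a one-hot index of each part within its group of geometrically similar parts. The function names (instance_encoding, quat_to_rot, apply_pose), the max_instances parameter, and the one-hot scheme itself are hypothetical choices made for illustration; the abstract does not specify how the encoding is built.

```python
import numpy as np

def instance_encoding(group_ids, max_instances=4):
    """Hypothetical instance encoding: one-hot index of each part
    within its group of geometrically similar parts."""
    seen = {}  # group id -> number of parts of that group seen so far
    enc = np.zeros((len(group_ids), max_instances))
    for i, g in enumerate(group_ids):
        k = seen.get(g, 0)
        if k >= max_instances:
            raise ValueError(f"more than {max_instances} parts in group {g!r}")
        enc[i, k] = 1.0
        seen[g] = k + 1
    return enc

def quat_to_rot(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = np.asarray(q, dtype=float) / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])

def apply_pose(points, quat, trans):
    """Apply a 6-DoF pose (rotation, then translation) to an (N, 3) point cloud."""
    return np.asarray(points, dtype=float) @ quat_to_rot(quat).T + np.asarray(trans, dtype=float)

# Three identical legs and one seat: the legs receive distinct codes.
codes = instance_encoding(["leg", "leg", "seat", "leg"])

# Rotate a point 90 degrees about the z-axis, then shift it along x.
half = np.sqrt(0.5)
moved = apply_pose([[1.0, 0.0, 0.0]], [half, 0.0, 0.0, half], [1.0, 0.0, 0.0])
```

In this sketch, identical parts such as chair legs would otherwise present the same geometry to the network; concatenating a distinct code to each part's feature is one simple way to let a model assign them different poses.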
