Manual-PA: Learning 3D Part Assembly from Instruction Diagrams

Jiahao Zhang; Anoop Cherian; Cristian Rodriguez; Weijian Deng; Stephen Gould

arXiv:2411.18011 · cs.CV · December 2, 2025

TL;DR

Manual-PA uses instruction diagrams and a transformer model to predict assembly order and part poses for 3D furniture part assembly, reporting gains over prior methods and generalization to real-world IKEA furniture.

Contribution

The paper introduces a transformer-based framework that uses diagram cues to split assembly into a discrete phase (selecting which parts to assemble) and a continuous phase (estimating their connecting poses), which makes the task easier to learn.

Findings

01. Significant improvement over state-of-the-art methods.

02. Strong generalization to real-world IKEA furniture assembly.

03. Effective use of diagram cues for assembly prediction.

Abstract

Assembling furniture amounts to solving the discrete-continuous optimization task of selecting the furniture parts to assemble and estimating their connecting poses in a physically realistic manner. The problem is hampered by its combinatorially large yet sparse solution space, which makes learning to assemble a challenging task for current machine learning models. In this paper, we attempt to solve this task by leveraging the assembly instructions provided in the diagrammatic manuals that typically accompany the furniture parts. Our key insight is to use the cues in these diagrams to split the problem into discrete and continuous phases. Specifically, we present Manual-PA, a transformer-based instruction Manual-guided 3D Part Assembly framework that learns to semantically align 3D parts with their illustrations in the manuals using a contrastive learning backbone towards predicting the…
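The contrastive alignment mentioned in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the feature vectors, the temperature value, the symmetric InfoNCE loss, and the helper names (`similarity_matrix`, `info_nce_loss`, `predict_order`) are all assumptions chosen for illustration. The sketch assumes each part and each manual step has already been encoded into a vector, scores part-to-step similarity, and reads off an assembly order by assigning each part to its best-matching step.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def similarity_matrix(part_feats, step_feats):
    """Cosine similarity between every part (rows) and every manual step (columns)."""
    return l2_normalize(part_feats) @ l2_normalize(step_feats).T


def info_nce_loss(sim, temperature=0.07):
    """Symmetric InfoNCE loss, assuming part i is the true match of step i.

    The temperature is a hypothetical default, not a value from the paper.
    """
    logits = sim / temperature

    def cross_entropy_on_diagonal(lg):
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        log_prob = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(logits.T))


def predict_order(sim):
    """Assign each part to its most similar step, then list parts by step index."""
    step_of_part = sim.argmax(axis=1)
    return np.argsort(step_of_part, kind="stable")


# Toy data (hypothetical): 4 manual steps with orthogonal 8-d embeddings,
# and 4 parts that are shuffled, slightly noisy copies of those embeddings.
rng = np.random.default_rng(0)
step_feats = np.eye(4, 8)
true_step_of_part = np.array([2, 0, 3, 1])  # part 0 belongs to step 2, etc.
part_feats = step_feats[true_step_of_part] + 0.01 * rng.normal(size=(4, 8))

sim = similarity_matrix(part_feats, step_feats)
order = predict_order(sim)  # part indices in predicted assembly order

# Loss is low once parts are sorted into step order, high when they are shuffled.
loss_shuffled = info_nce_loss(sim)
loss_aligned = info_nce_loss(similarity_matrix(part_feats[order], step_feats))
```

A greedy `argmax` is used here only to keep the sketch short; it does not guarantee a one-to-one assignment when several parts prefer the same step, so a real system would need a proper matching or permutation-learning step in its place.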

Taxonomy

Topics: Manufacturing Process and Optimization · Image Processing and 3D Reconstruction · Robot Manipulation and Learning

Methods: Contrastive Learning · ALIGN