Multi-level Reasoning for Robotic Assembly: From Sequence Inference to Contact Selection
Xinghao Zhu, Devesh K. Jha, Diego Romeres, Lingfeng Sun, Masayoshi Tomizuka, Anoop Cherian

TL;DR
This paper introduces a multi-level framework for robotic part assembly that combines assembly sequence inference, part motion planning, and robot contact optimization. Its sequence model, PAST, is trained on the large-scale D4PAS dataset and is reported to generalize better and to need less inference time than prior methods.
Contribution
The work presents PAST, a novel sequence-to-sequence neural network for assembly sequence inference, and D4PAS, a large dataset for training such models, advancing robotic assembly planning.
Findings
The proposed approach outperforms prior methods in generalization.
It requires less computational time for inference.
It successfully infers assembly sequences from target blueprints.
Abstract
Automating the assembly of objects from their parts is a complex problem with innumerable applications in manufacturing, maintenance, and recycling. Unlike existing research, which is limited to target segmentation, pose regression, or using fixed target blueprints, our work presents a holistic multi-level framework for part assembly planning consisting of part assembly sequence inference, part motion planning, and robot contact optimization. We present the Part Assembly Sequence Transformer (PAST) -- a sequence-to-sequence neural network -- to infer assembly sequences recursively from a target blueprint. We then use a motion planner and optimization to generate part movements and contacts. To train PAST, we introduce D4PAS: a large-scale Dataset for Part Assembly Sequences (D4PAS) consisting of physically valid sequences for industrial objects. Experimental results show that our…
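The abstract describes inferring an assembly sequence recursively from a target blueprint. One way to read that is sketched below: choose one part at a time, conditioning each choice on the blueprint and on the parts already chosen. This is a minimal illustration and not the authors' PAST model. The names `infer_sequence`, `toy_score`, and `SUPPORTS`, and the four-part example object, are hypothetical, and the hand-written scorer stands in for a learned network.

```python
from typing import Callable, Dict, List, Sequence, Set

# Hypothetical scorer signature: (blueprint, parts already placed, candidate) -> score.
# In a learned system this role would be played by a trained model.
Scorer = Callable[[Sequence[str], Set[str], str], float]


def infer_sequence(blueprint: Sequence[str], score: Scorer) -> List[str]:
    """Greedy step-by-step decoding: pick the best remaining part, then repeat on the rest."""
    placed: List[str] = []
    remaining = list(blueprint)
    while remaining:
        placed_set = set(placed)
        # max() returns the first candidate with the highest score.
        best = max(remaining, key=lambda part: score(blueprint, placed_set, part))
        placed.append(best)
        remaining.remove(best)
    return placed


# Toy stand-in for a learned scorer (assumption): a part is placeable
# once every part it rests on has already been placed.
SUPPORTS: Dict[str, Set[str]] = {
    "base": set(),
    "shaft": {"base"},
    "gear": {"shaft"},
    "cover": {"base", "gear"},
}


def toy_score(blueprint: Sequence[str], placed: Set[str], part: str) -> float:
    return 1.0 if SUPPORTS[part] <= placed else 0.0


sequence = infer_sequence(["cover", "gear", "shaft", "base"], toy_score)
print(sequence)
```

In the framework the abstract outlines, a sequence like this would then be passed to a motion planner and a contact optimizer; those stages are not sketched here.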
Taxonomy
Topics: Robot Manipulation and Learning · Manufacturing Process and Optimization · Image Processing and 3D Reconstruction
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Label Smoothing · Adam · Byte Pair Encoding · Layer Normalization
