Blox-Net: Generative Design-for-Robot-Assembly Using VLM Supervision, Physics Simulation, and a Robot with Reset
Andrew Goldberg, Kavish Kondap, Tianshuang Qiu, Zehan Ma, Letian Fu, Justin Kerr, Huang Huang, Kaiyuan Chen, Kuan Fang, Ken Goldberg

TL;DR
Blox-Net is a generative system that designs and physically assembles objects from a natural language prompt and an image of the available components, combining vision-language models, physics simulation, and a robot arm with minimal human oversight.
Contribution
The paper defines the Generative Design-for-Robot-Assembly (GDfRA) problem and presents Blox-Net, a system that integrates vision-language models, simulation, and robotics to produce assemblies that a robot can build reliably from a text prompt and an image of components.
Findings
Achieved 63.5% accuracy in the recognizability of generated assemblies.
Assembled the generated designs on a physical robot with near-perfect success over multiple iterations.
Performed the entire design-to-assembly process with zero human intervention.
Abstract
Generative AI systems have shown impressive capabilities in creating text, code, and images. Inspired by the rich history of research in industrial "Design for Assembly", we introduce a novel problem: Generative Design-for-Robot-Assembly (GDfRA). The task is to generate an assembly based on a natural language prompt (e.g., "giraffe") and an image of available physical components, such as 3D-printed blocks. The output is an assembly, a spatial arrangement of these components, and instructions for a robot to build this assembly. The output must 1) resemble the requested object and 2) be reliably assembled by a 6 DoF robot arm with a suction gripper. We then present Blox-Net, a GDfRA system that combines generative vision language models with well-established methods in computer vision, simulation, perturbation analysis, motion planning, and physical robot experimentation to solve a…
Taxonomy
Topics: Modular Robots and Swarm Intelligence · Manufacturing Process and Optimization · BIM and Construction Integration
