"Stack It Up!": 3D Stable Structure Generation from 2D Hand-drawn Sketch
Yiqing Xu, Linfeng Li, Cunjun Yu, David Hsu

TL;DR
StackItUp enables non-experts to generate stable, multi-level 3D structures from simple 2D sketches by using an abstract relation graph and diffusion models, bridging the gap between sketches and 3D models.
Contribution
This paper introduces StackItUp, a novel system that converts 2D hand-drawn sketches into accurate 3D structures using relation graphs and diffusion models, without requiring expert tools.
Findings
Outperforms baselines in stability and visual resemblance
Successfully generates complex 3D structures from sketches
Handles stability-critical supports effectively
Abstract
Imagine a child sketching the Eiffel Tower and asking a robot to bring it to life. Today's robot manipulation systems can't act on such sketches directly: they require precise 3D block poses as goals, which in turn demand structural analysis and expert tools like CAD. We present StackItUp, a system that enables non-experts to specify complex 3D structures using only 2D front-view hand-drawn sketches. StackItUp introduces an abstract relation graph to bridge the gap between rough sketches and accurate 3D block arrangements, capturing the symbolic geometric relations (e.g., left-of) and stability patterns (e.g., two-pillar-bridge) while discarding noisy metric details from sketches. It then grounds this graph to 3D poses using compositional diffusion models and iteratively updates it by predicting hidden internal and rear supports, which are critical for stability but absent from the sketch.…
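To make the idea of an abstract relation graph concrete, here is a minimal sketch of how symbolic relations might be read off rough 2D outlines. It is an illustration under stated assumptions, not the authors' implementation: the `Box` type, the `supported-by` relation name, the tolerance `tol`, and the helper functions are all hypothetical. Only the `left-of` relation and the `two-pillar-bridge` pattern are named in the abstract.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Box:
    """Axis-aligned block outline in a front-view sketch (y grows upward).

    Hypothetical representation; the paper's sketch format is not given here.
    """
    name: str
    x_min: float
    x_max: float
    y_min: float
    y_max: float


def left_of(a: Box, b: Box) -> bool:
    # a is entirely to the left of b along the horizontal axis.
    return a.x_max <= b.x_min


def supported_by(top: Box, bottom: Box, tol: float = 0.2) -> bool:
    # top rests on bottom if its base is near bottom's upper edge (within tol)
    # and the two outlines overlap horizontally. tol absorbs drawing noise.
    touching = abs(top.y_min - bottom.y_max) <= tol
    overlap = min(top.x_max, bottom.x_max) - max(top.x_min, bottom.x_min)
    return touching and overlap > 0


def build_relation_graph(boxes, tol: float = 0.2):
    """Return a set of (source, relation, target) edges; metric detail is dropped."""
    edges = set()
    for a, b in permutations(boxes, 2):
        if left_of(a, b):
            edges.add((a.name, "left-of", b.name))
        if supported_by(a, b, tol):
            edges.add((a.name, "supported-by", b.name))
    return edges


def two_pillar_bridges(edges):
    """Find (left_pillar, right_pillar, top) triples: one block resting on two others."""
    supports = {}
    for src, rel, dst in edges:
        if rel == "supported-by":
            supports.setdefault(src, set()).add(dst)
    found = []
    for top, pillars in supports.items():
        for p, q in permutations(sorted(pillars), 2):
            if (p, "left-of", q) in edges:
                found.append((p, q, top))
    return sorted(found)


# A deliberately imprecise sketch: the pillars differ in height and do not
# line up exactly, as a hand drawing would.
sketch = [
    Box("pillar_a", 0.0, 1.0, 0.0, 3.0),
    Box("pillar_b", 3.1, 4.0, 0.1, 2.9),
    Box("beam", -0.1, 4.2, 3.0, 4.0),
]
graph = build_relation_graph(sketch)
bridges = two_pillar_bridges(graph)
print(sorted(graph))
print(bridges)
```

The resulting graph keeps only which block is left of or resting on which, so the uneven pillar heights in the drawing no longer matter. In the system described above, a graph of this kind would then be grounded to 3D poses by the diffusion models, a step this sketch does not attempt.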
Taxonomy
Topics: 3D Shape Modeling and Analysis · Interactive and Immersive Displays · Robotics and Sensor-Based Localization
