From 2D CAD Drawings to 3D Parametric Models: A Vision-Language Approach
Xilin Wang, Jia Zheng, Yuanchao Hu, Hao Zhu, Qian Yu, and Zihan Zhou

TL;DR
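This paper introduces CAD2Program, a vision-language model that reconstructs 3D parametric models from 2D CAD drawings. It encodes the drawing as a raster image with a standard ViT and auto-regressively generates a text description of the 3D model, performing competitively with methods that operate on vector-graphics inputs while imposing fewer restrictions on the drawings.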
Contribution
The paper presents a novel vision-language approach for 3D model reconstruction from 2D CAD drawings, using image encoding and text generation, departing from traditional vector-based methods.
Findings
Achieves competitive performance with fewer restrictions on input drawings.
Uses a flexible text-based representation for 3D models.
Demonstrates effectiveness on a large-scale cabinet dataset.
Abstract
In this paper, we present CAD2Program, a new method for reconstructing 3D parametric models from 2D CAD drawings. Our proposed method is inspired by recent successes in vision-language models (VLMs), and departs from traditional methods which rely on task-specific data representations and/or algorithms. Specifically, on the input side, we simply treat the 2D CAD drawing as a raster image, regardless of its original format, and encode the image with a standard ViT model. We show that such an encoding scheme achieves competitive performance against existing methods that operate on vector-graphics inputs, while imposing substantially fewer restrictions on the 2D drawings. On the output side, our method auto-regressively predicts a general-purpose language describing 3D parametric models in text form. Compared to other sequence modeling methods for CAD which use domain-specific sequence…
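The input-side step described in the abstract (treating the drawing as a raster image and encoding it with a standard ViT) starts by cutting the image into fixed-size patches, each of which becomes one token. The following is a minimal sketch of that patch-splitting step only, in plain Python. The function name `patchify`, the 4x4 toy drawing, and the patch size of 2 are assumptions made for illustration; they are not taken from the paper, which does not state its resolution or patch size here.

```python
from typing import List

Image = List[List[float]]  # grayscale raster, row-major


def patchify(image: Image, patch: int) -> List[List[float]]:
    """Split an H x W raster into non-overlapping patch x patch tiles.

    Each tile is flattened row by row into one vector. In a ViT these
    vectors would then be linearly projected into token embeddings;
    that projection is omitted in this sketch.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tile = [image[r][c]
                    for r in range(top, top + patch)
                    for c in range(left, left + patch)]
            tokens.append(tile)
    return tokens


# Hypothetical 4x4 "drawing": 1.0 marks an inked pixel (illustrative data).
drawing = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
tokens = patchify(drawing, patch=2)
print(len(tokens), len(tokens[0]))
```

Because the tokens depend only on pixel values, the same step applies whether the drawing originated as a vector file or a scan, which is consistent with the abstract's point that the raster encoding imposes fewer restrictions on the input format.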
Taxonomy
Topics: 3D Surveying and Cultural Heritage · BIM and Construction Integration
Methods: Context Aggregated Bi-lateral Network for Semantic Segmentation
