CAD-Coder: An Open-Source Vision-Language Model for Computer-Aided Design Code Generation
Anna C. Doris, Md Ferdous Alam, Amin Heyrani Nobari, Faez Ahmed

TL;DR
CAD-Coder is an open-source vision-language model that generates editable CAD code from images, outperforming existing models and demonstrating promising generalization to real-world images, thus streamlining engineering design workflows.
Contribution
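Introduces CAD-Coder, a vision-language model fine-tuned to generate CadQuery Python code from images, together with GenCAD-Code, a new dataset of over 163k CAD-model image and code pairs. The model reaches a 100% valid syntax rate and outperforms VLM baselines such as GPT-4.5 and Qwen2.5-VL-72B.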
Findings
Achieves a 100% valid syntax rate on generated CadQuery Python code.
Outperforms state-of-the-art VLM baselines such as GPT-4.5 and Qwen2.5-VL-72B.
Shows potential to generate CAD code from real-world images.
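To make the "valid syntax rate" finding concrete, here is a minimal sketch of how such a rate could be computed over a batch of generated scripts. It assumes "valid" means only that the script parses as Python; the paper may use a stricter criterion, such as the script executing in CadQuery and producing a solid. The helper names (is_valid_python, valid_syntax_rate) and both sample scripts are hypothetical illustrations, not taken from CAD-Coder or GenCAD-Code.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python. Nothing is executed."""
    try:
        ast.parse(source)
        return True
    except (SyntaxError, ValueError):
        return False


def valid_syntax_rate(scripts) -> float:
    """Fraction of scripts that parse; 0.0 for an empty batch (an assumption)."""
    if not scripts:
        return 0.0
    return sum(is_valid_python(s) for s in scripts) / len(scripts)


# Hypothetical example of the kind of editable CadQuery script a model might emit:
# a 40 x 30 x 10 plate with a 6 mm through-hole centred on the top face.
GOOD_SCRIPT = '''
import cadquery as cq

result = (
    cq.Workplane("XY")
    .box(40, 30, 10)
    .faces(">Z")
    .workplane()
    .hole(6)
)
'''

# Hypothetical malformed output: the closing parenthesis is missing.
BAD_SCRIPT = '''
import cadquery as cq
result = cq.Workplane("XY").box(40, 30, 10
'''

rate = valid_syntax_rate([GOOD_SCRIPT, BAD_SCRIPT])
print(f"valid syntax rate: {rate:.0%}")
```

Because the check only parses the text, it runs without CadQuery installed. It also means a script can pass this check and still fail to build geometry, which is why syntax validity and geometric accuracy are separate measurements.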
Abstract
Efficient creation of accurate and editable 3D CAD models is critical in engineering design, significantly impacting cost and time-to-market in product innovation. Current manual workflows remain highly time-consuming and demand extensive user expertise. While recent developments in AI-driven CAD generation show promise, existing models are limited by incomplete representations of CAD operations, inability to generalize to real-world images, and low output accuracy. This paper introduces CAD-Coder, an open-source Vision-Language Model (VLM) explicitly fine-tuned to generate editable CAD code (CadQuery Python) directly from visual input. Leveraging a novel dataset that we created--GenCAD-Code, consisting of over 163k CAD-model image and code pairs--CAD-Coder outperforms state-of-the-art VLM baselines such as GPT-4.5 and Qwen2.5-VL-72B, achieving a 100% valid syntax rate and the highest…
Taxonomy
Topics: Manufacturing Process and Optimization
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Multi-Head Attention · Dense Connections · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Residual Connection · Byte Pair Encoding
