CADFS: A Big CAD Program Dataset and Framework for Computer-Aided Design with Large Language Models

Vladislav Pyatov; Gleb Bobrovskikh; Saveliy Galochkin; Nikita Boldyrev; Oleg Voynov; Alexander Filippov; Gonzalo Ferrer; Peter Wonka; Evgeny Burnaev

arXiv:2605.01925 · cs.CV · May 5, 2026

TL;DR

CADFS introduces a large-scale CAD dataset and a FeatureScript-based framework enabling vision-language models to generate complex, realistic CAD designs with improved accuracy and diversity, surpassing prior methods.

Contribution

The paper presents a dataset of 450k real-world CAD models spanning 15 modeling operations, together with a FeatureScript-based representation that lets fine-tuned vision-language models generate CAD design histories beyond sketch-extrude operations.

Findings

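01. State-of-the-art results in text-conditioned CAD generation and image-based reconstruction.

02. More accurate, diverse, and feature-rich CAD designs than prior frameworks.

03. Ablations show that each component of the framework, including the FeatureScript representation, significantly improves performance.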

Abstract

We introduce CADFS, a data-centric framework that enables large vision-language models to generate complex CAD design histories. Existing generative CAD systems are restricted to sketch-extrude operations due to simplified representations and limited datasets. We address this by introducing a FeatureScript-based representation and constructing a dataset of 450k real-world CAD models spanning 15 modeling operations. We obtain the dataset via a new pipeline that reconstructs clean, executable FeatureScript programs and provides multimodal annotations. Fine-tuning a VLM on this representation yields state-of-the-art results in text-conditioned CAD generation and image-based reconstruction, producing more accurate, diverse, and feature-rich designs than prior frameworks. Ablations show that each individual component of our framework, i.e., the FeatureScript representation, the extended…

Code & Models

Repositories

https://voyleg.github.io/cadfs
