PLLM: Pseudo-Labeling Large Language Models for CAD Program Synthesis
Yuanbo Li, Dule Shu, Yanying Chen, Matt Klenk, Daniel Ritchie

TL;DR
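PLLM is a self-training framework that lets large language models synthesize CAD programs from unlabeled 3D shapes. It iteratively samples candidate programs, keeps those whose executions match the target shape closely, and fine-tunes on the resulting synthetic pairs, improving geometric fidelity and program diversity.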
Contribution
It introduces a novel self-training method for CAD program synthesis that does not require paired shape-program data, leveraging unlabeled datasets for fine-tuning LLMs.
Findings
Consistent improvements in geometric fidelity.
Enhanced program diversity.
Effective adaptation of CAD-Recode to unlabeled datasets.
Abstract
Recovering Computer-Aided Design (CAD) programs from 3D geometries is a widely studied problem. Recent advances in large language models (LLMs) have enabled progress in CAD program synthesis, but existing methods rely on supervised training with paired shape-program data, which is often unavailable. We introduce PLLM, a self-training framework for CAD program synthesis from unlabeled 3D shapes. Given a pre-trained CAD-capable LLM and a shape dataset, PLLM iteratively samples candidate programs, selects high-fidelity executions, and augments programs to construct synthetic program-shape pairs for fine-tuning. Experiments on adapting CAD-Recode from DeepCAD to the unlabeled ABC dataset show consistent improvements in geometric fidelity and program diversity.
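The sample, select, augment, and fine-tune loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. Every name here (select_best, pseudo_label_round, self_train, and the sample, execute, fidelity, augment, and fine_tune callbacks) is hypothetical, as are the fidelity threshold, the number of samples k, and the choice to pair each kept program with the shape it executes to. The abstract does not specify any of these details.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

Shape = Any      # stand-in for a 3D shape representation (assumption)
Program = str    # CAD program as source text (assumption)


def select_best(
    shape: Shape,
    candidates: Sequence[Program],
    execute: Callable[[Program], Optional[Shape]],
    fidelity: Callable[[Shape, Shape], float],
    threshold: float,
) -> Optional[Program]:
    """Return the highest-fidelity candidate that executes, or None."""
    best, best_score = None, threshold
    for prog in candidates:
        result = execute(prog)
        if result is None:  # program failed to execute; skip it
            continue
        score = fidelity(result, shape)
        if score >= best_score:
            best, best_score = prog, score
    return best


def pseudo_label_round(
    shapes: Sequence[Shape],
    sample: Callable[[Shape], Program],
    execute: Callable[[Program], Optional[Shape]],
    fidelity: Callable[[Shape, Shape], float],
    augment: Callable[[Program], List[Program]],
    k: int = 8,
    threshold: float = 0.9,
) -> List[Tuple[Program, Shape]]:
    """Build synthetic (program, shape) pairs from unlabeled shapes."""
    pairs: List[Tuple[Program, Shape]] = []
    for shape in shapes:
        candidates = [sample(shape) for _ in range(k)]
        best = select_best(shape, candidates, execute, fidelity, threshold)
        if best is None:
            continue
        # Assumption: each kept or augmented program is paired with the
        # shape it actually executes to, so the pair is exact by construction.
        for prog in [best] + augment(best):
            out = execute(prog)
            if out is not None:
                pairs.append((prog, out))
    return pairs


def self_train(model, shapes, make_sampler, fine_tune,
               execute, fidelity, augment, rounds=3, **kwargs):
    """Alternate pseudo-labeling and fine-tuning for a fixed number of rounds."""
    for _ in range(rounds):
        pairs = pseudo_label_round(
            shapes, make_sampler(model), execute, fidelity, augment, **kwargs
        )
        model = fine_tune(model, pairs)
    return model
```

In practice, sample would wrap the LLM's decoding, execute would run the program in a CAD kernel, and fidelity would be a geometric similarity score between the executed result and the input shape. Those components are left as callbacks here because the abstract does not name them.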
Taxonomy
Topics: Manufacturing Process and Optimization · 3D Shape Modeling and Analysis · Machine Learning in Materials Science
