Text-to-CAD Retrieval: a Strong Baseline
Honghu Pan, Zibo Du, Daxiang Liu, Chengliang Liu, Xiaoling Luo

TL;DR
This paper introduces text-to-CAD retrieval, a new task of retrieving CAD models with natural language queries. It proposes a multi-modal embedding framework and establishes a benchmark on the Text2CAD dataset.
Contribution
The paper presents a unified multi-modal framework for text-to-CAD retrieval that combines procedural-sequence and geometric point-cloud features, and it establishes a benchmark for the task from the paired text-CAD annotations of the Text2CAD dataset.
Findings
Proposed a strong baseline framework for text-to-CAD retrieval.
Established a practical benchmark using the Text2CAD dataset.
Achieved effective multi-modal alignment for cross-modal retrieval.
Abstract
Text-based retrieval of Computer-Aided Design (CAD) models is a critical yet underexplored task for the reuse of legacy industrial designs. Existing CAD repositories are typically searched using filenames or directories, which limits the efficiency, scalability, and accuracy of design retrieval. In this paper, we formally introduce text-to-CAD retrieval as a new cross-modal retrieval task, aiming to retrieve semantically relevant CAD models from large-scale databases given natural language queries. Leveraging paired text-CAD annotations from the Text2CAD dataset, we establish a practical benchmark for this task. To achieve text-based retrieval, we propose a unified framework that learns multi-modal CAD embeddings from both procedural sequences and geometric point clouds. Specifically, a sequence encoder captures the construction logic of CAD models, while a point encoder extracts…
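The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes that a text encoder, a sequence encoder, and a point encoder have already produced fixed-length vectors in a shared space. The fusion rule (averaging the two normalized CAD vectors), the cosine-similarity ranking, and every name and toy vector below are assumptions made for illustration; the truncated abstract does not specify how the paper fuses or scores embeddings.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def fuse_cad_embedding(seq_emb, point_emb):
    """Combine a sequence embedding and a point-cloud embedding.

    Assumption: the fusion is a plain average of the two unit vectors.
    The paper's actual fusion is not described in the available text.
    """
    seq_n = l2_normalize(seq_emb)
    point_n = l2_normalize(point_emb)
    return l2_normalize([(a + b) / 2.0 for a, b in zip(seq_n, point_n)])


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def retrieve(text_emb, cad_db, top_k=3):
    """Rank CAD models by similarity to the query and return the top_k."""
    scored = [(cad_id, cosine_similarity(text_emb, emb))
              for cad_id, emb in cad_db.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors standing in for real encoder outputs (hypothetical).
cad_db = {
    "bracket": fuse_cad_embedding([1.0, 0.0, 0.0], [0.9, 0.1, 0.0]),
    "gear": fuse_cad_embedding([0.0, 1.0, 0.0], [0.1, 0.9, 0.0]),
    "flange": fuse_cad_embedding([0.0, 0.0, 1.0], [0.0, 0.2, 0.8]),
}

# Hypothetical embedding of a natural language query.
query_emb = [0.1, 0.95, 0.05]
results = retrieve(query_emb, cad_db, top_k=2)
for cad_id, score in results:
    print(f"{cad_id}: {score:.3f}")
```

In this toy database the query vector lies closest to the fused "gear" embedding, so "gear" is ranked first. With real encoders, the same ranking loop would run over the embeddings of the whole CAD repository, typically with an approximate nearest-neighbor index in place of the exhaustive scan.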
