Text-to-CAD Retrieval: a Strong Baseline

Honghu Pan; Zibo Du; Daxiang Liu; Chengliang Liu; Xiaoling Luo

arXiv:2605.05572 · cs.CV · May 8, 2026

TL;DR

This paper introduces a new task for retrieving CAD models using natural language queries, proposing a multi-modal embedding framework and establishing a benchmark with the Text2CAD dataset.

Contribution

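The paper presents a unified multi-modal framework for text-to-CAD retrieval that combines features from procedural construction sequences with geometric features from point clouds, and it builds a benchmark for the task from the paired text-CAD annotations in the Text2CAD dataset.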

Findings

01 Proposed a strong baseline framework for text-to-CAD retrieval.

02 Established a practical benchmark using the Text2CAD dataset.

03 Achieved effective multi-modal alignment for cross-modal retrieval.

Abstract

Text-based retrieval of Computer-Aided Design (CAD) models is a critical yet underexplored task for the reuse of legacy industrial designs. Existing CAD repositories are typically searched using filenames or directories, which limits the efficiency, scalability, and accuracy of design retrieval. In this paper, we formally introduce text-to-CAD retrieval as a new cross-modal retrieval task, aiming to retrieve semantically relevant CAD models from large-scale databases given natural language queries. Leveraging paired text-CAD annotations from the Text2CAD dataset, we establish a practical benchmark for this task. To achieve text-based retrieval, we propose a unified framework that learns multi-modal CAD embeddings from both procedural sequences and geometric point clouds. Specifically, a sequence encoder captures the construction logic of CAD models, while a point encoder extracts…
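To make the retrieval setting concrete, the following is a minimal sketch of embedding-based ranking, not the authors' implementation. It assumes each CAD model already has one embedding from a sequence encoder and one from a point encoder, and that the text query has been embedded into the same space. The fusion rule (averaging the two normalized vectors), the cosine-similarity scoring, the function names `l2_normalize`, `fuse`, and `retrieve`, and the toy vectors are all illustrative assumptions; the abstract as shown here does not specify them.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def fuse(seq_emb, point_emb):
    """Hypothetical fusion: average the two normalized modality embeddings.

    This is an assumption for illustration only; the paper's actual fusion
    is not described in the text above.
    """
    s = l2_normalize(seq_emb)
    p = l2_normalize(point_emb)
    return l2_normalize([(a + b) / 2.0 for a, b in zip(s, p)])


def retrieve(query_emb, cad_embs, top_k=5):
    """Rank CAD models by cosine similarity to the text query embedding.

    cad_embs maps a CAD model id to its fused embedding.
    Returns a list of (cad_id, score) pairs, best match first.
    """
    q = l2_normalize(query_emb)
    scored = []
    for cad_id, emb in cad_embs.items():
        e = l2_normalize(emb)
        score = sum(a * b for a, b in zip(q, e))
        scored.append((cad_id, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional embeddings standing in for real encoder outputs.
database = {
    "bracket": fuse([1.0, 0.0, 0.0], [0.9, 0.1, 0.0]),
    "gear": fuse([0.0, 1.0, 0.0], [0.1, 0.9, 0.0]),
    "flange": fuse([0.0, 0.0, 1.0], [0.0, 0.1, 0.9]),
}
query = [0.1, 0.95, 0.05]  # stand-in for an embedded text query

for cad_id, score in retrieve(query, database, top_k=2):
    print(cad_id, round(score, 3))
```

In a real system the toy vectors would be replaced by encoder outputs, and the linear scan would typically give way to an approximate nearest-neighbor index once the database grows large.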
