On the effectiveness of Large Language Models in the mechanical design domain

Daniele Grandi; Fabian Riquelme

arXiv:2505.01559 · cs.CL · May 6, 2025

TL;DR

This paper evaluates large language models' performance in mechanical engineering by using domain-specific datasets and unsupervised tasks, revealing their strengths and limitations in understanding domain terminology.

Contribution

It introduces domain-specific evaluation tasks for LLMs in mechanical design and analyzes their performance, highlighting challenges in applying language models to specialized engineering data.

Findings

01

A fine-tuned model reached 0.62 accuracy on the binary sentence-pair classification task.

02

The zero-shot classification model outperforms the baselines, reaching 0.386 accuracy.

03

Identified failure modes in language learning within the mechanical domain.
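
The zero-shot setting in finding 02 can be pictured as scoring every candidate label against an input and picking the best one, with no task-specific training. The sketch below is hypothetical: `token_overlap` is a toy stand-in for a language model's similarity score, and the labels and part names are invented for illustration. None of it is taken from the paper.

```python
def token_overlap(text, label):
    """Toy scorer: Jaccard overlap of lowercase tokens.

    ASSUMPTION: stands in for a language model's text-label similarity.
    """
    a, b = set(text.lower().split()), set(label.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def zero_shot_classify(text, candidate_labels, scorer=token_overlap):
    """Return the candidate label with the highest score for `text`."""
    return max(candidate_labels, key=lambda label: scorer(text, label))


def accuracy(predictions, targets):
    """Fraction of predictions that match their targets."""
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)


# Invented toy data: (part name, true assembly label).
labels = ["gear assembly", "bicycle frame", "pump housing"]
examples = [
    ("spur gear", "gear assembly"),
    ("frame tube", "bicycle frame"),
    ("bolt", "pump housing"),
]
preds = [zero_shot_classify(text, labels) for text, _ in examples]
acc = accuracy(preds, [target for _, target in examples])
```

Swapping `scorer` for a model-based similarity function would keep the rest of the loop unchanged, which is why the scorer is passed in as a parameter.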

Abstract

In this work, we seek to understand the performance of large language models in the mechanical engineering domain. We leverage the semantic data found in the ABC dataset, specifically the assembly names that designers assigned to the overall assemblies and the individual semantic part names that were assigned to each part. After pre-processing the data, we developed two unsupervised tasks to evaluate how different model architectures perform on domain-specific data: a binary sentence-pair classification task and a zero-shot classification task. We achieved a 0.62 accuracy on the binary sentence-pair classification task with a fine-tuned model that focuses on fighting over-fitting by 1) modifying learning rates, 2) adjusting dropout values, 3) changing the sequence length, and 4) adding a multi-head attention layer. Our model on the zero-shot classification task outperforms the baselines by a wide margin, and…
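
One way the binary sentence-pair task described above could be set up is sketched below. This is a minimal illustration, not the authors' pipeline: the pairing scheme (an assembly name paired with a part name, labeled 1 if the part belongs to that assembly and 0 if it was sampled from another assembly), the function name `build_pairs`, and the toy data are all assumptions.

```python
import random


def build_pairs(assemblies, seed=0):
    """Build (assembly_name, part_name, label) triples.

    ASSUMPTION: label 1 means the part belongs to the assembly; label 0
    means the part was drawn from a different assembly. The paper's actual
    pairing scheme may differ.
    """
    rng = random.Random(seed)
    pairs = []
    for name, parts in assemblies.items():
        own = set(parts)
        # Parts that never occur in this assembly are candidate negatives.
        foreign = sorted(
            {p for other, ps in assemblies.items() if other != name for p in ps}
            - own
        )
        for part in parts:
            pairs.append((name, part, 1))
            if foreign:
                pairs.append((name, rng.choice(foreign), 0))
    rng.shuffle(pairs)
    return pairs


# Invented toy data: assembly name -> part names.
toy = {
    "bicycle frame": ["top tube", "seat post", "head tube"],
    "gearbox": ["input shaft", "spur gear", "housing"],
}
pairs = build_pairs(toy)
```

Drawing one negative per positive keeps the two classes balanced, so an accuracy figure on such data can be read against a 0.5 chance level.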

Code & Models

Repositories

grndnl/w266_final_project (tf, Official)

Taxonomy

Topics: Model-Driven Software Engineering Techniques · Natural Language Processing Techniques · Manufacturing Process and Optimization

Methods: Attention Is All You Need · Softmax · Linear Layer · Multi-Head Attention · Dropout