On the effectiveness of Large Language Models in the mechanical design domain
Daniele Grandi, Fabian Riquelme

TL;DR
This paper evaluates large language models' performance in mechanical engineering by using domain-specific datasets and unsupervised tasks, revealing their strengths and limitations in understanding domain terminology.
Contribution
It introduces domain-specific evaluation tasks for LLMs in mechanical design and analyzes their performance, highlighting challenges in applying language models to specialized engineering data.
Findings
Achieved 0.62 accuracy on the binary sentence-pair classification task with a fine-tuned model.
Zero-shot classification outperforms baselines with 0.386 accuracy.
Identified failure modes in language learning within the mechanical domain.
Abstract
In this work, we seek to understand the performance of large language models in the mechanical engineering domain. We leverage the semantic data found in the ABC dataset, specifically the assembly names that designers assigned to the overall assemblies and the individual semantic part names that were assigned to each part. After pre-processing the data, we developed two unsupervised tasks to evaluate how different model architectures perform on domain-specific data: a binary sentence-pair classification task and a zero-shot classification task. We achieved an accuracy of 0.62 on the binary sentence-pair classification task with a fine-tuned model whose changes focus on fighting over-fitting: 1) modified learning rates, 2) modified dropout values, 3) modified sequence length, and 4) an added multi-head attention layer. Our model on the zero-shot classification task outperforms the baselines by a wide margin, and…
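The abstract does not say how sentence pairs are built from the assembly and part names, so the following is only a minimal sketch of one plausible construction, not the authors' method. It assumes that two part names from the same assembly form a positive pair and that a part name drawn from a different assembly forms a negative pair. The function `build_sentence_pairs` and the toy data are hypothetical and are not taken from the ABC dataset.

```python
import random
from itertools import combinations


def build_sentence_pairs(assemblies, seed=0):
    """Return (part_a, part_b, label) triples.

    Assumption (not from the paper): label 1 means both parts come from
    the same assembly, label 0 means they come from different assemblies.
    """
    rng = random.Random(seed)
    names = sorted(assemblies)
    pairs = []
    for name in names:
        # Positive pairs: every two distinct parts that share an assembly.
        positives = list(combinations(assemblies[name], 2))
        # Candidate negatives: parts drawn from every other assembly.
        other_parts = [
            part for other in names if other != name
            for part in assemblies[other]
        ]
        for part_a, part_b in positives:
            pairs.append((part_a, part_b, 1))
            if other_parts:
                # One negative per positive keeps the two classes balanced.
                pairs.append((part_a, rng.choice(other_parts), 0))
    return pairs


# Hypothetical toy data for illustration only.
toy_assemblies = {
    "bicycle": ["frame", "front wheel", "chain"],
    "gearbox": ["housing", "input shaft", "spur gear"],
}
pairs = build_sentence_pairs(toy_assemblies)
```

On real data, generic part names such as fasteners can appear in many assemblies, so a negative sampled this way may not be a true negative. A real pipeline would likely need to filter or deduplicate such names.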
Taxonomy
Topics: Model-Driven Software Engineering Techniques · Natural Language Processing Techniques · Manufacturing Process and Optimization
Methods: Attention Is All You Need · Softmax · Linear Layer · Multi-Head Attention · Dropout · Approximate Bayesian Computation
