A Large Language Model-Empowered Agent for Reliable and Robust Structural Analysis

Jiachen Liu, Ziheng Geng, Ran Cao, Lu Cheng, Paolo Bocchini, and Minghui Cheng

arXiv:2507.02938 · cs.CL · July 8, 2025

TL;DR

This paper evaluates how reliably and robustly large language models perform structural analysis of beams, shows that the tested model (Llama-3.3 70B Instruct) falls short on both counts, and introduces an LLM-powered agent that raises accuracy to over 99% by reframing analysis as code generation.

Contribution

The paper reframes structural analysis as a code generation task and builds an LLM-based analysis agent on that idea, which proves both accurate and robust.
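To make the idea of "analysis as code" concrete, the following is a minimal sketch of the kind of program an LLM could be asked to write instead of doing the arithmetic itself. It is an illustration only, not the paper's agent or toolchain: the function name `simply_supported_point_load`, the choice of a simply supported beam with one point load, and the closed-form statics are all assumptions made for this example.

```python
def simply_supported_point_load(length, load, position):
    """Support reactions and peak bending moment for a simply supported
    beam carrying one downward point load.

    Hypothetical helper for illustration; units must be consistent
    (for example metres and kilonewtons).
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if not 0 <= position <= length:
        raise ValueError("position must lie on the beam")

    # Moment equilibrium about the left support gives the right reaction.
    reaction_right = load * position / length
    # Vertical force equilibrium gives the left reaction.
    reaction_left = load - reaction_right
    # The bending moment peaks directly under the point load.
    moment_max = reaction_left * position
    return reaction_left, reaction_right, moment_max


if __name__ == "__main__":
    # Example: 10 m span, 20 kN load placed 4 m from the left support.
    print(simply_supported_point_load(10.0, 20.0, 4.0))
```

The point of the sketch is the division of labour: the model only has to translate the problem statement into the call's arguments, while the numerical work is done deterministically by the executed code.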

Findings

01

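Despite a qualitative understanding of structural mechanics, the tested LLM (Llama-3.3 70B Instruct) lacks the quantitative reliability and robustness needed for engineering applications.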

02

Reframing analysis as code generation improves accuracy to over 99%.

03

The agent performs reliably across diverse load and boundary conditions.

Abstract

Large language models (LLMs) have exhibited remarkable capabilities across diverse open-domain tasks, yet their application in specialized domains such as civil engineering remains largely unexplored. This paper begins to bridge this gap by evaluating and enhancing the reliability and robustness of LLMs in structural analysis of beams. Reliability is assessed as the accuracy of outputs over repeated runs of the same problems, whereas robustness is evaluated via performance across varying load and boundary conditions. A benchmark dataset comprising eight beam analysis problems is created to test the Llama-3.3 70B Instruct model. Results show that, despite a qualitative understanding of structural mechanics, the LLM lacks the quantitative reliability and robustness for engineering applications. To address these limitations, a shift is proposed that reframes the…
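The two evaluation notions in the abstract can be sketched in code. This is one possible reading, not the paper's scoring procedure: the function names, the 1% relative tolerance, and the sample numbers are assumptions chosen for illustration.

```python
import math


def reliability(outputs, reference, rel_tol=0.01):
    """Fraction of repeated-run outputs that match the reference value.

    rel_tol is an assumed tolerance, not a value taken from the paper.
    """
    if not outputs:
        raise ValueError("need at least one output")
    hits = sum(math.isclose(o, reference, rel_tol=rel_tol) for o in outputs)
    return hits / len(outputs)


def robustness(runs_by_condition, rel_tol=0.01):
    """Per-condition accuracy across varying load and boundary conditions.

    runs_by_condition maps a condition label to (outputs, reference).
    """
    return {
        label: reliability(outputs, reference, rel_tol)
        for label, (outputs, reference) in runs_by_condition.items()
    }


if __name__ == "__main__":
    # Made-up outputs from four repeated runs of one hypothetical problem.
    print(reliability([48.0, 48.2, 50.0, 47.9], reference=48.0))
    # Made-up outputs grouped by hypothetical condition labels.
    print(robustness({
        "point load": ([48.0, 48.0], 48.0),
        "cantilever": ([30.0, 36.0], 30.0),
    }))
```

Under this reading, a model is reliable when the score for a single problem stays near 1.0 across repeated runs, and robust when the per-condition scores stay high as the load and boundary conditions change.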
