Is Vibe Coding the Future? An Empirical Assessment of LLM Generated Codes for Construction Safety

S M Jamil Uddin

arXiv:2604.12311·cs.SE·April 28, 2026

TL;DR

This study empirically assesses the reliability and safety fidelity of LLM-generated code for construction safety, revealing significant risks of silent failures and hallucinations that compromise safety logic.

Contribution

It provides a comprehensive evaluation of LLM-generated construction safety code, highlighting the limitations and risks of current models for critical safety applications.

Findings

01 Approximately 45% silent failure rate in generated scripts

02 GPT-4o-Mini produced mathematically inaccurate outputs in 56% of functional code

03 Less formal prompts increase hallucination and missing safety variables

Abstract

The emergence of vibe coding, a paradigm in which non-technical users instruct Large Language Models (LLMs) to generate executable code via natural language, presents both significant opportunities and severe risks for the construction industry. While it empowers construction personnel such as safety managers, foremen, and workers to develop their own tools and software, the probabilistic nature of LLMs introduces the threat of silent failures, wherein generated code runs without error but executes flawed mathematical safety logic. This study empirically evaluates the reliability, software architecture, and domain-specific safety fidelity of 450 vibe-coded Python scripts generated by three frontier models: Claude 3.5 Haiku, GPT-4o-Mini, and Gemini 2.5 Flash. Utilizing a persona-driven prompt dataset (n=150) and a bifurcated evaluation pipeline comprising isolated dynamic sandboxing and an…
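To make the notion of a silent failure concrete, the following is a minimal illustrative sketch and is not taken from the paper's dataset or evaluation pipeline. It assumes the common 4:1 ladder setup rule (base offset is about one quarter of the ladder's working length) as the reference logic. All function names here are hypothetical. The flawed function runs without raising any error, yet returns an unsafe value, which is the pattern the abstract describes.

```python
import math


def ladder_base_distance_reference(working_length_ft: float) -> float:
    """Reference logic (assumed 4:1 rule): base offset = working length / 4."""
    return working_length_ft / 4


def ladder_base_distance_flawed(working_length_ft: float) -> float:
    """Hypothetical generated code: runs cleanly but multiplies instead of dividing."""
    return working_length_ft * 4


def is_silent_failure(candidate, reference, test_inputs, rel_tol=1e-6) -> bool:
    """Return True if candidate never raises but disagrees with reference on some input."""
    mismatch = False
    for x in test_inputs:
        try:
            got = candidate(x)
        except Exception:
            # A crash is a loud failure, not a silent one.
            return False
        expected = reference(x)
        # Non-numeric output or a numeric disagreement both count as a mismatch.
        if not isinstance(got, (int, float)) or not math.isclose(
            got, expected, rel_tol=rel_tol
        ):
            mismatch = True
    return mismatch


if __name__ == "__main__":
    inputs = [8.0, 16.0, 24.0]
    print(is_silent_failure(ladder_base_distance_flawed,
                            ladder_base_distance_reference, inputs))
    print(is_silent_failure(ladder_base_distance_reference,
                            ladder_base_distance_reference, inputs))
```

A check of this kind only catches errors for which a trusted reference calculation exists. A non-technical user who cannot write that reference is the case in which such failures are likely to go unnoticed.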
