Using Large Language Models for Black-Box Testing of FMU-Based Simulations

Abdullah Mughees; Gaadha Sudheerbabu; Tanwir Ahmad; Dragos Truscan; Mikael Manngård; Kristian Klemets

arXiv:2604.25650·cs.SE·April 29, 2026

TL;DR

This paper introduces a human-in-the-loop method leveraging Large Language Models to automate black-box testing of FMU-based simulations, reducing manual effort and enhancing interpretability.

Contribution

It presents a novel approach that uses LLMs to generate structured test scenarios and evaluate simulation outputs for FMUs, improving test automation.

Findings

01

LLM-assisted scenario generation facilitates automatic test design.

02

The approach produces human-readable logs and plots for analysis.

03

Evaluation on a Lube Oil Cooling system demonstrates practical effectiveness.

Abstract

We propose a human-in-the-loop approach for black-box testing of Functional Mock-up Units (FMUs) using Large Language Models (LLMs). The goal is to reduce the manual effort in defining test scenarios for dynamic simulation models and to improve the interpretability of results. The approach takes the functional and interface specifications of an FMU as input and prompts an LLM to generate structured scenario goals in Given-When-Then format. Each goal defines the initial input conditions of the simulation, a possible change in those conditions, and the expected output behaviour of the system in response to that change. The corresponding scenario plans specify input patterns and add assertion oracles that describe the expected output patterns defined in the scenario goals. The approach generates a complete input time series for the scenario plans, runs the FMU simulation, and evaluates assertions on the…
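The pipeline described in the abstract (Given-When-Then goal, input time series, simulation run, assertion oracle) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `Scenario` fields, the helper names, and above all `simulate_stub`, a first-order lag that stands in for a real FMU so the example runs without any simulation library.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Oracle = Callable[[List[float], List[float]], bool]


@dataclass
class Scenario:
    """Hypothetical Given-When-Then scenario for a single input signal."""
    name: str
    given: float        # Given: initial input value
    when_time: float    # When: time at which the input changes
    when_value: float   # When: input value after the change
    then: Oracle        # Then: assertion oracle over (times, outputs)


def build_input_series(s: Scenario, stop_time: float, dt: float) -> Tuple[List[float], List[float]]:
    """Expand the scenario into a complete step-shaped input time series."""
    n = int(round(stop_time / dt)) + 1
    times = [i * dt for i in range(n)]
    inputs = [s.given if t < s.when_time else s.when_value for t in times]
    return times, inputs


def simulate_stub(times: List[float], inputs: List[float], tau: float = 5.0) -> List[float]:
    """Stand-in for an FMU run (assumption): dy/dt = (u - y) / tau, explicit Euler."""
    y = inputs[0]  # start at steady state for the initial input
    outputs = []
    for i, u in enumerate(inputs):
        outputs.append(y)
        if i + 1 < len(times):
            y += (times[i + 1] - times[i]) * (u - y) / tau
    return outputs


def settles_near(target: float, tol: float, after: float) -> Oracle:
    """Oracle: every output sample at or after `after` lies within `tol` of `target`."""
    def oracle(times: List[float], outputs: List[float]) -> bool:
        return all(abs(y - target) <= tol for t, y in zip(times, outputs) if t >= after)
    return oracle


def run_scenario(s: Scenario, stop_time: float, dt: float) -> bool:
    """Generate inputs, run the stand-in simulation, and evaluate the oracle."""
    times, inputs = build_input_series(s, stop_time, dt)
    outputs = simulate_stub(times, inputs)
    return s.then(times, outputs)


scenario = Scenario(
    name="step up in input",
    given=20.0,
    when_time=10.0,
    when_value=30.0,
    then=settles_near(target=30.0, tol=0.5, after=50.0),
)
verdict = run_scenario(scenario, stop_time=60.0, dt=0.1)
print(scenario.name, "PASS" if verdict else "FAIL")
```

In a real setup the stub would be replaced by a call into an FMU simulation tool, and the scenario fields would be filled from LLM output after human review. Keeping the oracle as a plain function over the time series makes each verdict easy to log alongside a plot.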
