Exploring the Potential of Large Language Models in Simulink-Stateflow Mutant Generation
Pablo Valle, Shaukat Ali, Aitor Arrieta

TL;DR
This paper explores using Large Language Models to generate high-quality mutants for Simulink-Stateflow models, addressing limitations of traditional mutation techniques in safety-critical Cyber-Physical Systems.
Contribution
It introduces an automated pipeline leveraging LLMs for mutant generation, demonstrating significant speed and quality improvements over baseline methods.
Findings
LLMs produce mutants up to 13x faster than manual methods.
LLMs generate fewer equivalent and duplicate mutants.
Few-shot prompting with low-to-medium temperature yields best results.
Abstract
Mutation analysis is a powerful technique for assessing test-suite adequacy, yet conventional approaches suffer from generating redundant, equivalent, or non-executable mutants. These challenges are particularly amplified in Simulink-Stateflow models due to the hierarchical structure these models have, which integrate continuous dynamics with discrete-event behaviors and are widely deployed in safety-critical Cyber-Physical Systems (CPSs). While prior work has explored machine learning and manually engineered mutation operators, these approaches remain constrained by limited training data and scalability issues. Motivated by recent advances in Large Language Models (LLMs), we investigate their potential to generate high-quality, domain-specific mutants for Simulink-Stateflow models. We develop an automated pipeline that converts Simulink-Stateflow models to structured JSON…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdversarial Robustness in Machine Learning · Software Testing and Debugging Techniques · Formal Methods in Verification
