Model Input Verification of Large Scale Simulations
Rumyana Neykova, Derek Groen

TL;DR
This paper introduces FabGuard, a data validation tool for large-scale simulations that checks input correctness, integrates into existing workflows, and uses LLMs to automate constraint inference. It is demonstrated across diverse simulation domains.
Contribution
It presents a formalism and pipeline for model input verification in simulations, incorporating LLMs for automating constraint generation and validation.
Findings
FabGuard processes 12,000 files in 140 seconds, roughly 86 files per second.
LLMs correctly inferred 22 out of 23 constraints in a migration case study.
Model input verification (MIV) is feasible for large datasets and diverse simulation domains.
Abstract
Reliable simulations are critical for analyzing and understanding complex systems, but their accuracy depends on correct input data. Incorrect inputs such as invalid or out-of-range values, missing data, and format inconsistencies can cause simulation crashes or unnoticed result distortions, ultimately undermining the validity of the conclusions. This paper presents a methodology for verifying the validity of input data in simulations, a process we term model input verification (MIV). We implement this approach in FabGuard, a toolset that uses established data schema and validation tools for the specific needs of simulation modeling. We introduce a formalism for categorizing MIV patterns and offer a streamlined verification pipeline that integrates into existing simulation workflows. FabGuard's applicability is demonstrated across three diverse domains: conflict-driven migration,…
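The three error classes named in the abstract (out-of-range values, missing data, and format inconsistencies) can be illustrated with a minimal sketch. This is not FabGuard's API: the `CONSTRAINTS` table, the column names, and the `validate_rows` function are hypothetical, chosen only to show what a pre-run input check might look like using the Python standard library.

```python
import csv
import io

# Hypothetical constraint table (an assumption, not taken from the paper):
# column name -> (type to cast to, minimum or None, maximum or None).
CONSTRAINTS = {
    "population": (int, 0, None),
    "latitude": (float, -90.0, 90.0),
    "longitude": (float, -180.0, 180.0),
}


def validate_rows(text):
    """Return a list of (line_number, column, message), one per violation."""
    errors = []
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []

    # Format check: every constrained column must be present in the header.
    missing_cols = [col for col in CONSTRAINTS if col not in header]
    for col in missing_cols:
        errors.append((1, col, "missing column"))

    for row in reader:
        line_no = reader.line_num  # line of the row just read
        for col, (cast, lo, hi) in CONSTRAINTS.items():
            if col in missing_cols:
                continue
            raw = (row.get(col) or "").strip()
            # Missing-data check.
            if raw == "":
                errors.append((line_no, col, "missing value"))
                continue
            # Format check: the value must parse as the expected type.
            try:
                value = cast(raw)
            except ValueError:
                errors.append((line_no, col, f"not a valid {cast.__name__}: {raw!r}"))
                continue
            # Range checks.
            if lo is not None and value < lo:
                errors.append((line_no, col, f"below minimum {lo}"))
            if hi is not None and value > hi:
                errors.append((line_no, col, f"above maximum {hi}"))
    return errors


# Made-up sample input for illustration only.
SAMPLE = """name,population,latitude,longitude
A,1200,12.5,30.1
B,-5,95.0,30.2
C,,11.0,abc
"""

if __name__ == "__main__":
    for line_no, col, message in validate_rows(SAMPLE):
        print(f"line {line_no}, column {col}: {message}")
```

Collecting every violation instead of stopping at the first one lets a modeller fix an input file in a single pass before submitting a large simulation run.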
Taxonomy
Topics: Simulation Techniques and Applications · Model-Driven Software Engineering Techniques
