LLM-based Generation of Semantically Diverse and Realistic Domain Model Instances

Andrei Coman; Lola Burgueño; Dominik Bork; Manuel Wimmer

arXiv:2604.10350·cs.SE·April 14, 2026

TL;DR

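This paper presents a method that uses Large Language Models and two prompting strategies, combined with existing model validation tools, to generate semantically realistic and diverse domain model instances, so that the generated instances are human-understandable and useful in educational or data-driven research contexts.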

Contribution

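The approach combines LLMs, two prompting strategies, and existing model validation tools to produce semantically coherent and diverse UML class diagram instances. It targets two open challenges in domain model instantiation: filling instances with realistic values and generating semantically diverse models.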
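The sketch below shows one way to pair an LLM with a validation tool. It is not the authors' pipeline, which this page does not describe. The toy domain model, the `validate` checker, the retry-with-feedback loop, and the `fake_llm` stand-in are all hypothetical and exist only to make the example runnable.

```python
from typing import Callable, List, Optional

# Hypothetical toy domain model: class name -> {attribute name: expected Python type}.
DOMAIN_MODEL = {
    "Book": {"title": str, "pages": int},
}


def validate(instance: dict) -> List[str]:
    """Return conformance errors for an instance; an empty list means it is valid."""
    cls = instance.get("class")
    if cls not in DOMAIN_MODEL:
        return [f"unknown class: {cls!r}"]
    errors = []
    attrs = instance.get("attributes", {})
    for name, expected in DOMAIN_MODEL[cls].items():
        if name not in attrs:
            errors.append(f"missing attribute: {name}")
        elif not isinstance(attrs[name], expected):
            errors.append(f"{name} should be {expected.__name__}")
    return errors


def generate_valid_instance(
    llm: Callable[[str, List[str]], dict],
    prompt: str,
    max_attempts: int = 3,
) -> Optional[dict]:
    """Ask the LLM for an instance and retry with validator feedback until it conforms."""
    feedback: List[str] = []
    for _ in range(max_attempts):
        candidate = llm(prompt, feedback)
        feedback = validate(candidate)
        if not feedback:
            return candidate
    return None  # no conforming instance within the attempt budget


def fake_llm(prompt: str, feedback: List[str]) -> dict:
    """Stand-in for a real LLM call (assumption): the first answer has a type error."""
    if not feedback:
        return {"class": "Book", "attributes": {"title": "Dune", "pages": "412"}}
    return {"class": "Book", "attributes": {"title": "Dune", "pages": 412}}


result = generate_valid_instance(fake_llm, "Create one realistic Book instance.")
```

In this sketch the validator checks only structural conformance. Realism and diversity of the values, which the paper targets, would need separate checks.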

Findings

01

Generated instances are mostly syntactically correct.

02

Instances conform to the domain models with few semantic errors.

03

Values in generated models are semantically diverse and coherent.

Abstract

Large Language Models (LLMs) have been recently proposed for supporting domain modeling tasks mostly related to the completion of partial models by recommending additional model elements. However, there are many more modeling tasks, one of them being the instantiation of domain models to represent concrete domain objects. While there is considerable work supporting the generation of structurally valid instantiations, there are still open challenges to incorporating real-world semantics by having realistic values contained in instances and ensuring the generation of semantically diverse models. Only then will such generated models become human-understandable and helpful in educational or data-driven research contexts. To tackle these challenges, this paper presents an approach that employs LLMs and two prompting strategies in combination with existing model validation tools for…
