Detecting Semantic Alignments between Textual Specifications and Domain Models
Shwetali Shimangaud, Lola Burgueño, Rijul Saini, Jörg Kienzle

TL;DR
This paper introduces an NLP and LLM-based approach to automatically assess the alignment between textual specifications and domain models, aiding early software engineering tasks.
Contribution
It presents a novel method combining NLP preprocessing and large language models to classify model elements as aligned, misaligned, or unclassified, with high precision and practical execution times.
Findings
Achieves precision close to 1 when identifying alignments and misalignments.
Recovers approximately 78% of model elements correctly.
Execution time ranges from 18 seconds to 1 minute per element.
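To make the relationship between the first two findings concrete, here is a minimal sketch of how precision can stay at 1 while only part of the model is recovered, because elements may be left unclassified. The metric definitions and the function name `precision_and_recall` are assumptions for illustration; this summary does not state the paper's exact formulas.

```python
from typing import List, Tuple


def precision_and_recall(predictions: List[str], truth: List[str]) -> Tuple[float, float]:
    """Assumed definitions (not taken from the paper):
    precision = correct verdicts / elements that received a verdict,
    recall    = correct verdicts / all elements.
    An "unclassified" prediction counts as no verdict."""
    classified = sum(1 for p in predictions if p != "unclassified")
    correct = sum(1 for p, t in zip(predictions, truth) if p == t)
    precision = correct / classified if classified else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


# Toy data: 10 elements, 8 receive a (correct) verdict, 2 are left unclassified.
truth = ["aligned"] * 6 + ["misaligned"] * 4
predictions = ["aligned"] * 5 + ["unclassified"] + ["misaligned"] * 3 + ["unclassified"]
precision, recall = precision_and_recall(predictions, truth)
```

In this toy case every emitted verdict is correct, so precision is 1, while the two abstentions lower the share of recovered elements.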
Abstract
Context: Having domain models derived from textual specifications has proven to be very useful in the early phases of software engineering. However, creating correct domain models and establishing clear links with the textual specification is a challenging task, especially for novice modelers. Objectives: We propose an approach for determining the alignment between a partial domain model and a textual specification. Methods: To this aim, we use Natural Language Processing techniques to pre-process the text, generate an artificial natural language specification for each model element, and then use an LLM to compare the generated description with matched sentences from the original specification. Ultimately, our algorithm classifies each model element as either aligned (i.e., correct), misaligned (i.e., incorrect), or unclassified (i.e., insufficient evidence). Furthermore, it outputs the…
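The steps named in the abstract (match sentences, generate a description per model element, compare with an LLM, classify) can be sketched roughly as follows. This is an illustrative outline, not the authors' implementation: the `ModelElement` fields, the sentence template, the keyword-based matching, the voting rule, and the `judge` callback (a stand-in for the LLM comparison) are all assumptions.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional


class Verdict(Enum):
    ALIGNED = "aligned"
    MISALIGNED = "misaligned"
    UNCLASSIFIED = "unclassified"


@dataclass
class ModelElement:
    owner: str  # hypothetical: the class that owns the attribute
    name: str   # hypothetical: the attribute name


def describe(element: ModelElement) -> str:
    """Generate an artificial sentence for the element (assumed template)."""
    return f"Each {element.owner} has a {element.name}."


def match_sentences(element: ModelElement, sentences: List[str]) -> List[str]:
    """Naive keyword overlap, standing in for the NLP pre-processing step."""
    terms = {element.owner.lower(), element.name.lower()}
    return [s for s in sentences if terms & set(re.findall(r"[a-z]+", s.lower()))]


# judge(description, sentence) -> True (agrees), False (contradicts), None (cannot tell).
Judge = Callable[[str, str], Optional[bool]]


def classify(element: ModelElement, sentences: List[str], judge: Judge) -> Verdict:
    """Assumed voting rule: a verdict is emitted only when the evidence is one-sided."""
    description = describe(element)
    votes = [judge(description, s) for s in match_sentences(element, sentences)]
    if True in votes and False not in votes:
        return Verdict.ALIGNED
    if False in votes and True not in votes:
        return Verdict.MISALIGNED
    return Verdict.UNCLASSIFIED  # no matches, or conflicting/absent evidence


def stub_judge(description: str, sentence: str) -> Optional[bool]:
    """Placeholder for an LLM call; agrees only if the sentence mentions 'name'."""
    return True if "name" in sentence.lower() else None


spec = ["Every customer has a name and an address.", "Orders are shipped weekly."]
result = classify(ModelElement("Customer", "name"), spec, stub_judge)
```

In a real setting `judge` would wrap a prompt to an LLM; keeping it as a callback lets the classification logic be tested without any model access.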
Taxonomy
Topics: Model-Driven Software Engineering Techniques · Software Engineering Research · Topic Modeling
