Can ChatGPT Generate Realistic Synthetic System Requirement Specifications? Results of a Case Study
Alex R. Mattukat, Florian M. Braun, Horst Lichter

TL;DR
This study explores ChatGPT's ability to generate realistic synthetic system requirement specifications (SyRSs) across industries, highlighting both its potential and limitations through systematic evaluation and expert feedback.
Contribution
The paper demonstrates a systematic approach for generating and assessing synthetic SyRSs with ChatGPT, revealing insights into their realism and the challenges involved.
Findings
62% of experts considered the synthetic SyRSs (SSyRSs) to be realistic
Generated 300 SSyRSs across 10 industries using prompt patterns
LLM-based assessments cannot fully replace expert evaluations
Abstract
System requirement specifications (SyRSs) are central, natural-language (NL) artifacts. Access to real SyRSs for research purposes is highly valuable but limited by proprietary restrictions or confidentiality concerns. Generating synthetic SyRSs (SSyRSs) can address this scarcity. Black-box large language models (LLMs) such as ChatGPT offer compelling generation capabilities by providing easy access to NL generation functions without requiring access to real data. However, LLMs suffer from hallucinations and overconfidence, which pose major challenges in their use. We designed an exploratory study to investigate whether, despite these challenges, we can generate realistic SSyRSs with ChatGPT without having access to real SyRSs. Using a systematic approach that leverages prompt patterns, LLM-based quality assessments, and iterative prompt refinements, we generated 300 SSyRSs across 10…
