Early-Stage Product Line Validation Using LLMs: A Study on Semi-Formal Blueprint Analysis
Viet-Man Le, Thi Ngoc Trang Tran, Sebastian Lubos, Alexander Felfernig, Damian Garber

TL;DR
This study evaluates how well large language models can analyze semi-formal feature blueprints for early validation in software product line engineering. The best reasoning-optimized models reach 88-89% average accuracy against a solver-based oracle.
Contribution
The paper shows that reasoning-optimized LLMs can perform feature model analysis operations directly on semi-formal textual blueprints with accuracy approaching that of solver-based methods, which supports early product line validation.
Findings
Grok 4 Fast Reasoning and Gemini 2.5 Pro achieve 88-89% average accuracy across all evaluated blueprints and operations.
Systematic errors occur in structural parsing and constraint reasoning.
Accuracy-cost trade-offs inform model selection.
Abstract
We study whether Large Language Models (LLMs) can perform feature model analysis operations (AOs) directly on semi-formal textual blueprints, i.e., concise constrained-language descriptions of feature hierarchies and constraints, enabling early validation in Software Product Line scoping. Using 12 state-of-the-art LLMs and 16 standard AOs, we compare their outputs against the solver-based oracle FLAMA. Results show that reasoning-optimized models (e.g., Grok 4 Fast Reasoning, Gemini 2.5 Pro) achieve 88-89% average accuracy across all evaluated blueprints and operations, approaching solver correctness. We identify systematic errors in structural parsing and constraint reasoning, and highlight accuracy-cost trade-offs that inform model selection. These findings position LLMs as lightweight assistants for early variability validation.
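To make the evaluation setup concrete, the following is a minimal sketch of how answers to a few analysis operations could be scored against an oracle. Everything in it is an assumption for illustration: the toy feature model, the operation names, the hypothetical LLM output, and the exact-match scoring are not taken from the paper, and brute-force enumeration merely stands in for a solver-based oracle such as FLAMA (no FLAMA API is used).

```python
from itertools import product

# Hypothetical toy feature model (not from the paper).
FEATURES = ["Shop", "Catalog", "Payment", "Card", "PayPal", "Search"]

def is_valid(c):
    """Check one configuration (dict: feature -> bool) against the toy model."""
    return bool(
        c["Shop"]                                  # root is always selected
        and c["Catalog"] == c["Shop"]              # Catalog is a mandatory child
        and (not c["Payment"] or c["Shop"])        # Payment is an optional child
        and (not c["Search"] or c["Shop"])         # Search is an optional child
        # Alternative group: exactly one of Card/PayPal iff Payment is selected.
        and (c["Card"] + c["PayPal"] == (1 if c["Payment"] else 0))
        and not (c["PayPal"] and c["Search"])      # cross-tree: PayPal excludes Search
    )

def valid_configurations():
    """Enumerate all feature assignments and keep the valid ones."""
    configs = []
    for bits in product([False, True], repeat=len(FEATURES)):
        c = dict(zip(FEATURES, bits))
        if is_valid(c):
            configs.append(c)
    return configs

def oracle_answers():
    """Brute-force stand-in for a solver: answers to three illustrative operations."""
    configs = valid_configurations()
    return {
        "configurations_number": len(configs),
        "core_features": {f for f in FEATURES if all(c[f] for c in configs)},
        "dead_features": {f for f in FEATURES if not any(c[f] for c in configs)},
    }

def accuracy(predicted, expected):
    """Fraction of operations whose predicted answer exactly matches the oracle."""
    hits = sum(predicted.get(op) == answer for op, answer in expected.items())
    return hits / len(expected)

expected = oracle_answers()

# Hypothetical LLM output that miscounts the configurations.
llm_output = {
    "configurations_number": 6,
    "core_features": {"Shop", "Catalog"},
    "dead_features": set(),
}

print(expected)
print(accuracy(llm_output, expected))
```

In this sketch the hypothetical model overlooks the cross-tree constraint when counting configurations, which is one way a constraint-reasoning error of the kind the abstract mentions could show up in a per-operation score.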
