Can an LLM Detect Instances of Microservice Infrastructure Patterns?
Carlos Eduardo Duarte, Neil B. Harrison, Filipe Figueiredo Correia, Ademar Aguiar, Pavlína Gonçalves

TL;DR
This paper evaluates the effectiveness of GPT-based models in detecting microservice architectural patterns across diverse programming languages and artifacts, highlighting variability in detection accuracy based on pattern prevalence and artifact clarity.
Contribution
Introduces MicroPAD, a GPT-5 nano-based tool for cross-language detection of microservice patterns, and provides an empirical assessment of LLMs in architectural pattern recognition.
Findings
Detection performance varies widely across patterns (F1 scores from 0.09 to 0.70).
Patterns linked to dominant artifacts are detected more reliably.
Detection effectiveness depends on pattern prevalence and artifact distinctiveness.
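For readers less familiar with the metric, here is a minimal sketch of how a per-pattern F1 score is computed from detection counts. The pattern names and the true-positive, false-positive, and false-negative counts are hypothetical, chosen only to show what scores near the two ends of the reported range look like; they are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall.

    Returns 0.0 when there are no true positives, since precision
    and recall are then zero or undefined.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # share of reported instances that are real
    recall = tp / (tp + fn)     # share of real instances that were reported
    return 2 * precision * recall / (precision + recall)


# Hypothetical (tp, fp, fn) counts per pattern; NOT data from the paper.
counts = {
    "pattern_a": (7, 3, 3),
    "pattern_b": (1, 9, 11),
}

scores = {name: f1_score(*c) for name, c in counts.items()}

for name, score in scores.items():
    print(f"{name}: F1 = {score:.2f}")
```

Under these assumed counts, a pattern found seven times with three false alarms and three misses scores far higher than one found once with nine false alarms and eleven misses, which is the kind of gap the findings describe.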
Abstract
Architectural patterns are frequently found in various software artifacts. The wide variety of patterns and their implementations makes detection challenging with current tools, especially since they often only support detecting patterns in artifacts written in a single language. Large Language Models (LLMs), trained on a diverse range of software artifacts and knowledge, might overcome the limitations of existing approaches. However, their true effectiveness and the factors influencing their performance have not yet been thoroughly examined. To better understand this, we developed MicroPAD. This tool uses GPT-5 nano to identify architectural patterns in software artifacts written in any language, based on natural-language pattern descriptions. We used MicroPAD to evaluate an LLM's ability to detect instances of architectural patterns, particularly infrastructure-related…
Taxonomy
Topics: Software System Performance and Reliability · Software Engineering Research · Software Testing and Debugging Techniques
