A Defect Taxonomy for Infrastructure as Code: A Replication Study
Wendell Oliveira, Filipe Paiva, Thiago Emmanuel Pereira, João Brunet

TL;DR
This study replicates and extends a defect taxonomy for Infrastructure as Code (IaC), confirming its applicability across diverse tools and projects, and highlighting persistent defect types like Configuration Data issues.
Contribution
It validates the generalizability of an existing IaC defect taxonomy across multiple programming languages and project types, and enhances an automated classification tool.
Findings
The same eight defect categories are confirmed across IaC tools.
Configuration Data defects are highly frequent in both open-source and proprietary projects.
Idempotency and security defects are infrequent but consistently present.
Abstract
Background: As Infrastructure as Code (IaC) becomes standard practice, ensuring the reliability of IaC scripts is essential. Defect taxonomies are valuable tools for this, offering a common language for issues and enabling systematic tracking. A significant prior study developed such a taxonomy, but based it exclusively on the declarative language Puppet. It remained unknown whether this taxonomy applies to programming language-based IaC (PL-IaC) tools like Pulumi, Terraform CDK, and AWS CDK. Aim: We replicated this foundational work to assess the generalizability of the taxonomy across a broader and more diverse landscape. Method: We performed qualitative analysis on 3,364 defect-related commits from 285 open-source PL-IaC repositories (PIPr dataset) to derive a PL-IaC-specific defect taxonomy. We then enhanced the ACID tool, originally developed for the prior study, to automatically…
Taxonomy
Topics: Software Engineering Research · Advanced Software Engineering Methodologies · Software System Performance and Reliability
