DeepIaC: Deep Learning-Based Linguistic Anti-pattern Detection in IaC
Nemania Borovits, Indika Kumara, Parvathy Krishnan, Stefano Dalla Palma, Dario Di Nucci, Fabio Palomba, Damian A. Tamburri, Willem-Jan van den Heuvel

TL;DR
This paper introduces a deep learning approach to detect linguistic anti-patterns in infrastructure as code scripts, focusing on inconsistencies between code logic and naming to improve code quality.
Contribution
It presents a novel automated method using word embeddings and deep learning to identify linguistic anti-patterns in IaC scripts, leveraging abstract syntax trees for embedding.
Findings
Achieved detection accuracy between 78.5% and 91.5%.
Demonstrated effectiveness on open source IaC repositories.
Supports maintainability by flagging inconsistencies between the names and bodies of IaC code units.
Abstract
Linguistic anti-patterns are recurring poor practices concerning inconsistencies among the naming, documentation, and implementation of an entity. They impede readability, understandability, and maintainability of source code. This paper attempts to detect linguistic anti-patterns in infrastructure as code (IaC) scripts used to provision and manage computing environments. In particular, we consider inconsistencies between the logic/body of IaC code units and their names. To this end, we propose a novel automated approach that employs word embeddings and deep learning techniques. We build and use the abstract syntax tree of IaC code units to create their code embedments. Our experiments with a dataset systematically extracted from open source repositories show that our approach yields an accuracy between 0.785 and 0.915 in detecting inconsistencies.
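To make the name-versus-body idea concrete, the following is a minimal sketch of how a parsed IaC code unit might be flattened into token sequences before any embedding step. It is not the authors' pipeline: the Ansible-style task dictionary, the helper names `ast_tokens` and `split_name_and_body`, and the choice of a plain depth-first traversal are all assumptions made for illustration. The embedding and classification stages are omitted.

```python
from typing import Any, Dict, Iterator, List, Tuple


def ast_tokens(node: Any) -> Iterator[str]:
    """Depth-first walk over a parsed code unit (nested dicts/lists).

    Yields each mapping key followed by the tokens of its value,
    and each scalar leaf as a string. Hypothetical helper.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            yield str(key)
            yield from ast_tokens(value)
    elif isinstance(node, list):
        for item in node:
            yield from ast_tokens(item)
    else:
        yield str(node)


def split_name_and_body(task: Dict[str, Any]) -> Tuple[List[str], List[str]]:
    """Separate a code unit's name from its body (hypothetical helper).

    The name becomes lower-cased word tokens; everything else in the
    unit is treated as the body and flattened with ast_tokens.
    """
    name_tokens = str(task.get("name", "")).lower().split()
    body = {k: v for k, v in task.items() if k != "name"}
    return name_tokens, list(ast_tokens(body))


# Assumed example: an Ansible-style task already parsed into a dict.
task = {
    "name": "Install nginx",
    "apt": {"name": "nginx", "state": "present"},
}

name_tokens, body_tokens = split_name_and_body(task)
print(name_tokens)  # ['install', 'nginx']
print(body_tokens)  # ['apt', 'name', 'nginx', 'state', 'present']
```

In a setup like this, the two token sequences would be mapped to word embeddings and passed to a classifier that predicts whether the name is consistent with the body. The model and training data actually used are described in the paper itself.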
