GenSIaC: Toward Security-Aware Infrastructure-as-Code Generation with Large Language Models

Yikun Li; Matteo Grella; Daniel Nahmias; Gal Engelberg; Dan Klein; Giancarlo Guizzardi; Thijs van Ede; Andrea Continella

arXiv:2511.12385·cs.CR·November 18, 2025

Open Access

TL;DR

This paper explores how Large Language Models can be fine-tuned with a specialized dataset to generate security-aware Infrastructure as Code, significantly reducing security vulnerabilities and misconfigurations.

Contribution

The paper introduces GenSIaC, a fine-tuning dataset that enhances LLMs' ability to recognize and generate secure IaC scripts, addressing security weaknesses in current models.

Findings

01. F1-score for security recognition improved from 0.303 to 0.858
02. Models effectively identify major IaC security weaknesses
03. GenSIaC demonstrates good generalizability across LLMs and languages
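For readers unfamiliar with the metric behind the first finding, the sketch below shows how an F1-score is conventionally computed from true positives, false positives, and false negatives. It illustrates the standard definition only; the function name and the counts are hypothetical and are not taken from the paper, whose exact evaluation protocol is not described on this page.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1: the harmonic mean of precision and recall."""
    if tp == 0:
        # No correct detections: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)  # share of flagged weaknesses that are real
    recall = tp / (tp + fn)     # share of real weaknesses that were flagged
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only; NOT results from the paper.
example = f1_score(tp=8, fp=2, fn=2)
```

Because F1 is a harmonic mean, a model cannot score well by flagging everything (high recall, low precision) or by flagging almost nothing (high precision, low recall), which is why it is a common choice for weakness-recognition tasks.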

Abstract

In recent years, Infrastructure as Code (IaC) has emerged as a critical approach for managing and provisioning IT infrastructure through code and automation. IaC enables organizations to create scalable and consistent environments, effectively managing servers and development settings. However, the growing complexity of cloud infrastructures has led to an increased risk of misconfigurations and security vulnerabilities in IaC scripts. To address this problem, this paper investigates the potential of Large Language Models (LLMs) in generating security-aware IaC code, avoiding misconfigurations introduced by developers and administrators. While LLMs have made significant progress in natural language processing and code generation, their ability to generate secure IaC scripts remains unclear. This paper addresses two major problems: 1) the lack of understanding of security weaknesses in…

Taxonomy

Topics: Software System Performance and Reliability · Information and Cyber Security · Security and Verification in Computing