Automated DevOps Pipeline Generation for Code Repositories using Large Language Models
Deep Mehta, Kartik Rawool, Subodh Gujar, Bowen Xu

TL;DR
This paper explores using GPT 3.5 and GPT 4 to automatically generate and evaluate GitHub Action workflows, enhancing DevOps automation with a new GitHub App and novel evaluation metrics.
Contribution
The paper introduces a methodology for generating DevOps workflows with large language models and evaluates the generated workflows, highlighting GPT 4's improvements over GPT 3.5.
Findings
GPT 4 outperforms GPT 3.5 in syntax correctness and DevOps awareness
The study develops a new DevOps Aware score for workflow evaluation
A GitHub App is created to facilitate automated workflow generation
Abstract
Automating software development processes through the orchestration of GitHub Action workflows has revolutionized the efficiency and agility of software delivery pipelines. This paper presents a detailed investigation into the use of Large Language Models (LLMs), specifically GPT 3.5 and GPT 4, to generate and evaluate GitHub Action workflows for DevOps tasks. Our methodology involves data collection from public GitHub repositories, prompt engineering for LLM utilization, and evaluation metrics encompassing exact match scores, BLEU scores, and a novel DevOps Aware score. The research scrutinizes the proficiency of GPT 3.5 and GPT 4 in generating GitHub workflows, while assessing the influence of various prompt elements in constructing the most efficient pipeline. Results indicate substantial advancements in GPT 4, particularly in DevOps awareness and syntax correctness. The research…
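As an illustration of two of the metrics named in the abstract, the sketch below compares a generated workflow against a reference workflow using an exact match check and a sentence-level BLEU score. It is a minimal sketch, not the authors' evaluation code: the whitespace tokenization, the trailing-whitespace normalization, the 4-gram limit, the lack of smoothing, and the function names exact_match and bleu are all assumptions made here. The DevOps Aware score is not sketched because its definition does not appear in this summary.

```python
import math
from collections import Counter


def exact_match(generated: str, reference: str) -> bool:
    """True when both workflows are identical, ignoring trailing whitespace.

    Assumption: the normalization rule is illustrative, not taken from the paper.
    """
    def normalize(text):
        return [line.rstrip() for line in text.strip().splitlines()]

    return normalize(generated) == normalize(reference)


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(generated: str, reference: str, max_n: int = 4) -> float:
    """Unsmoothed sentence-level BLEU against a single reference.

    Assumption: tokens are whitespace-separated and n-gram orders 1..max_n
    are weighted uniformly.
    """
    cand, ref = generated.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one empty precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    brevity = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)


# Hypothetical workflows, used only to exercise the two functions.
reference = (
    "name: CI\non: push\njobs:\n  build:\n    runs-on: ubuntu-latest\n"
    "    steps:\n      - uses: actions/checkout@v4\n      - run: make test\n"
)
generated = reference.replace("make test", "make build")

print(exact_match(generated, reference))
print(round(bleu(generated, reference), 3))
```

Exact match is strict: a single differing token makes it fail, whereas BLEU still gives partial credit for the overlapping n-grams, which is why the two are typically reported together.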
Taxonomy
Topics: Scientific Computing and Data Management · Software Engineering Research · Software System Performance and Reliability
