On the effectiveness of Large Language Models for GitHub Workflows
Xinyu Zhang, Siddharth Muralee, Sourag Cherupattamoolayil, and Aravind Machiry

TL;DR
This study evaluates the effectiveness of Large Language Models in automating, generating, and securing GitHub workflows, highlighting their capabilities and limitations in this specialized domain.
Contribution
It is the first comprehensive analysis of LLMs applied to GitHub workflows, including a large dataset and various prompt strategies.
Findings
LLMs can assist in generating GitHub workflows with varying success.
Fine-tuning improves LLM performance on workflow tasks.
Current LLMs have notable limitations in understanding complex workflow semantics.
Abstract
GitHub workflows, or GitHub CI, is a popular continuous integration platform that enables developers to automate various software engineering tasks by specifying them as workflows, i.e., YAML files with a list of jobs. However, engineering valid workflows is tedious. They are also prone to severe security issues, which can result in supply chain vulnerabilities. Recent advancements in Large Language Models (LLMs) have demonstrated their effectiveness in various software development tasks. However, GitHub workflows differ from regular programs in both structure and semantics. We perform the first comprehensive study to understand the effectiveness of LLMs on five workflow-related tasks with different levels of prompts. We curated a set of 400K workflows and generated prompts with varying detail. We also fine-tuned LLMs on GitHub workflow tasks. Our evaluation of three…
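To make "YAML files with a list of jobs" concrete, here is a minimal sketch of a workflow written as the Python dictionary a YAML parser would roughly produce, plus a simplified structural check. The helper `check_workflow` and the sample workflow are hypothetical illustrations, not the paper's tooling or GitHub's actual validator, and the check covers only a few of the rules a real workflow must satisfy.

```python
# Hypothetical helper: a simplified structural check for a workflow that has
# already been loaded into a dict. It is NOT GitHub's validator and NOT the
# method used in the paper; it only illustrates the basic workflow shape.
def check_workflow(workflow):
    problems = []

    # Every workflow needs a trigger under the "on" key.
    if "on" not in workflow:
        problems.append("missing trigger: 'on'")

    # Every workflow needs a non-empty mapping of jobs.
    jobs = workflow.get("jobs")
    if not isinstance(jobs, dict) or not jobs:
        problems.append("missing or empty 'jobs'")
        return problems

    for job_id, job in jobs.items():
        # A job that calls a reusable workflow has "uses" instead of steps.
        if "uses" in job:
            continue
        if "runs-on" not in job:
            problems.append(f"job '{job_id}': missing 'runs-on'")
        steps = job.get("steps")
        if not steps:
            problems.append(f"job '{job_id}': missing 'steps'")
            continue
        for index, step in enumerate(steps):
            # Each step either runs an action ("uses") or a shell command ("run").
            if "uses" not in step and "run" not in step:
                problems.append(f"job '{job_id}' step {index}: needs 'uses' or 'run'")
    return problems


# Illustrative workflow: one job that checks out the code and runs a command.
example_workflow = {
    "name": "CI",
    "on": ["push"],
    "jobs": {
        "build": {
            "runs-on": "ubuntu-latest",
            "steps": [
                {"uses": "actions/checkout@v4"},
                {"run": "make test"},
            ],
        }
    },
}

# A deliberately broken variant: no trigger, no runner, and an empty step.
broken_workflow = {"jobs": {"build": {"steps": [{"name": "does nothing"}]}}}

if __name__ == "__main__":
    print(check_workflow(example_workflow))
    print(check_workflow(broken_workflow))
```

A check like this catches only surface-level structure. Whether a workflow is semantically correct or secure is a separate question from whether it has the right shape.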
Taxonomy
Topics: Scientific Computing and Data Management · Distributed and Parallel Computing Systems · Business Process Modeling and Analysis
