Using Large Language Models to Support Automation of Failure Management in CI/CD Pipelines: A Case Study in SAP HANA
Duong Bui, Stefan Grintz, Alexander Berndt, Thomas Bach

TL;DR
This study evaluates the use of large language models to automate failure management in CI/CD pipelines for SAP HANA, demonstrating high accuracy in error localization and solution generation when provided with historical failure data.
Contribution
It shows that LLMs, combined with historical failure data, can effectively automate failure management tasks in industrial CI/CD pipelines, a novel application in this domain.
Findings
92.1% success in generating exact solutions
97.4% accuracy in error localization with domain knowledge
Historical failure data significantly improves system performance
Abstract
Managing CI/CD pipeline failures manually is time-consuming. Automating this process is non-trivial because the information required for effective failure management is unstructured and cannot be processed automatically by traditional programs. Because large language models (LLMs) can process unstructured data, previous work has reported promising results for LLM-based automated failure management. Following these studies, we evaluated whether an LLM-based system could automate failure management in a CI/CD pipeline in the context of a large industrial software project, namely SAP HANA. We evaluated the ability of the LLM-based system to identify the error location and to propose exact solutions that contain no unnecessary actions. To support the LLM in generating exact solutions, we provided it with different types of domain knowledge, including pipeline information,…
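The abstract's notion of an exact solution, one that contains no unnecessary actions, can be illustrated with a minimal Python sketch. It assumes each proposed and reference solution is reduced to a set of action labels and treats a proposal as exact when the two sets match. The record type, the action names, and the set-equality criterion are illustrative assumptions, not the authors' evaluation procedure.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Evaluation:
    """One evaluated pipeline failure (hypothetical record layout)."""
    located_correctly: bool          # did the system point at the right error location?
    proposed_actions: frozenset      # actions suggested by the LLM-based system
    reference_actions: frozenset     # actions a human expert considers necessary


def is_exact(e: Evaluation) -> bool:
    # Assumption: "exact" means every necessary action is present
    # and no unnecessary action is added, i.e. the sets are equal.
    return e.proposed_actions == e.reference_actions


def rate(flags: Iterable[bool]) -> float:
    # Fraction of True values; 0.0 for an empty sample.
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0


# Hypothetical sample; action names are made up for illustration.
results = [
    Evaluation(True, frozenset({"retry_job"}), frozenset({"retry_job"})),
    Evaluation(True, frozenset({"retry_job", "clear_cache"}), frozenset({"retry_job"})),
]

localization_accuracy = rate(e.located_correctly for e in results)
exact_solution_rate = rate(is_exact(e) for e in results)
print(localization_accuracy, exact_solution_rate)
```

In this sample the second proposal fixes the failure but adds an extra action, so it counts toward localization accuracy without counting as an exact solution.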
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Software Testing and Debugging Techniques
