# Extracting Build Changes with BUILDDIFF

**Authors:** Christian Macho, Shane McIntosh, Martin Pinzger

arXiv: 1703.08527 · 2017-03-27

## TL;DR

BUILDDIFF is a novel approach that accurately extracts and classifies detailed build changes from Maven build files, enabling better understanding and management of build evolution in software projects.

## Contribution

The paper introduces BUILDDIFF, a method that extracts build changes from Maven build files and classifies them into 95 change types with an average precision of 0.96 and recall of 0.98. It also reports build change patterns observed in 30 open source Java projects.

## Key findings

- The 10 most frequent change types account for 73% of all build changes
- Changes to version numbers and dependencies are most frequent
- Build changes occur frequently around releases
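
The first finding is a cumulative share: the number of changes belonging to the k most frequent change types, divided by the total number of changes. A minimal sketch of that arithmetic follows. The function name `top_k_share` and the toy labels are hypothetical and chosen only for illustration; they are not data or identifiers from the paper.

```python
from collections import Counter


def top_k_share(change_types, k=10):
    """Return the fraction of all changes covered by the k most frequent types."""
    counts = Counter(change_types)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # most_common(k) yields (type, count) pairs sorted by descending count.
    covered = sum(n for _, n in counts.most_common(k))
    return covered / total


# Made-up labels for illustration only; not taken from the study.
toy_changes = (
    ["VERSION_UPDATE"] * 5
    + ["DEPENDENCY_INSERT"] * 3
    + ["PLUGIN_DELETE"] * 2
)

share = top_k_share(toy_changes, k=2)
print(f"top-2 share: {share:.0%}")
```

Applied to the full list of classified changes with `k=10`, the same calculation would give the kind of figure reported above.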

## Abstract

Build systems are an essential part of modern software engineering projects. As software projects change continuously, it is crucial to understand how the build system changes because neglecting its maintenance can lead to expensive build breakage. Recent studies have investigated the (co-)evolution of build configurations and reasons for build breakage, but they did this only on a coarse-grained level. In this paper, we present BUILDDIFF, an approach to extract detailed build changes from MAVEN build files and classify them into 95 change types. In a manual evaluation of 400 build-changing commits, we show that BUILDDIFF can extract and classify build changes with an average precision and recall of 0.96 and 0.98, respectively. We then present two studies using the build changes extracted from 30 open source Java projects to study the frequency and time of build changes. The results show that the top 10 most frequent change types account for 73% of the build changes. Among them, changes to version numbers and changes to dependencies of the projects occur most frequently. Furthermore, our results show that build changes occur frequently around releases. With these results, we provide the basis for further research, such as for analyzing the (co-)evolution of build files with other artifacts or improving effort estimation approaches. Furthermore, our detailed change information enables improvements of refactoring approaches for build configurations and improvements of models to identify error-prone build files.
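
To make the idea of a fine-grained build change concrete, the sketch below compares the direct dependencies of two versions of a Maven `pom.xml` and labels each difference. It is a simplified set comparison written for illustration, not the extraction algorithm described in the paper. The labels `DEPENDENCY_INSERT`, `DEPENDENCY_DELETE`, and `DEPENDENCY_VERSION_UPDATE`, the function names, and the sample POMs are assumptions; the paper's actual taxonomy of 95 change types is defined in its full text.

```python
import xml.etree.ElementTree as ET

# Standard Maven POM namespace, bound to the prefix "m" for path lookups.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def _text(elem, tag):
    """Return the stripped text of a child element, or None if it is absent."""
    value = elem.findtext(tag, default=None, namespaces=NS)
    return value.strip() if value is not None else None


def read_dependencies(pom_xml):
    """Map (groupId, artifactId) -> version for the direct dependencies of a POM."""
    root = ET.fromstring(pom_xml)
    deps = {}
    for dep in root.findall("m:dependencies/m:dependency", NS):
        key = (_text(dep, "m:groupId") or "", _text(dep, "m:artifactId") or "")
        deps[key] = _text(dep, "m:version")
    return deps


def diff_dependencies(old_pom, new_pom):
    """Return (label, key, old_version, new_version) tuples; labels are illustrative."""
    old, new = read_dependencies(old_pom), read_dependencies(new_pom)
    changes = []
    for key in sorted(new.keys() - old.keys()):
        changes.append(("DEPENDENCY_INSERT", key, None, new[key]))
    for key in sorted(old.keys() - new.keys()):
        changes.append(("DEPENDENCY_DELETE", key, old[key], None))
    for key in sorted(old.keys() & new.keys()):
        if old[key] != new[key]:
            changes.append(("DEPENDENCY_VERSION_UPDATE", key, old[key], new[key]))
    return changes


# Hypothetical before/after versions of a build file.
OLD_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency><groupId>junit</groupId><artifactId>junit</artifactId><version>4.11</version></dependency>
    <dependency><groupId>commons-io</groupId><artifactId>commons-io</artifactId><version>2.4</version></dependency>
  </dependencies>
</project>"""

NEW_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency><groupId>junit</groupId><artifactId>junit</artifactId><version>4.12</version></dependency>
    <dependency><groupId>org.slf4j</groupId><artifactId>slf4j-api</artifactId><version>1.7.21</version></dependency>
  </dependencies>
</project>"""

for change in diff_dependencies(OLD_POM, NEW_POM):
    print(change)
```

A comparison like this only covers the `<dependencies>` section and ignores ordering, scopes, plugins, properties, and inherited configuration, all of which a full change extractor would need to handle.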

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.08527/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1703.08527/full.md

## References

33 references — full list in the complete paper: https://tomesphere.com/paper/1703.08527/full.md

---
Source: https://tomesphere.com/paper/1703.08527