TL;DR
This paper introduces git2net, a scalable tool for extracting detailed co-editing networks from large git repositories, enabling high-resolution analysis of developer collaboration patterns in software engineering.
Contribution
The paper presents git2net, a novel Python tool that analyzes textual code changes to generate fine-grained developer networks from large-scale git data.
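As a rough illustration of what a co-editing network is, the sketch below aggregates line-level edit records into weighted, directed edges from the developer who changed a line to the developer who last wrote it. It is not git2net's API or algorithm: the record format, the function name `build_coediting_network`, and the sample data are all hypothetical, and a real pipeline would first derive such records from the diffs and blame information in the commit history.

```python
from collections import Counter

# Hypothetical line-level edit records: (editing author, previous author of the line).
# Assumption: these would be obtained by diffing each commit against its parent
# and looking up who last touched each modified line.
edits = [
    ("alice", "bob"),
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "carol"),  # a developer editing their own line
    ("bob", "alice"),
]


def build_coediting_network(edits, include_self_edits=False):
    """Map each directed edge (editor, previous_author) to its edited-line count."""
    weights = Counter()
    for editor, previous_author in edits:
        # Self-edits say nothing about collaboration, so skip them by default.
        if editor == previous_author and not include_self_edits:
            continue
        weights[(editor, previous_author)] += 1
    return dict(weights)


network = build_coediting_network(edits)
for (editor, owner), weight in sorted(network.items()):
    print(f"{editor} -> {owner}: {weight} line(s)")
```

Counting edited lines as edge weights is only one possible choice; weighting by characters changed or by number of commits would fit the same structure.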
Findings
Analyzed data on more than 1.2 million commits and 25,000 developers.
Tested a hypothesis on the relation between developer productivity and co-editing patterns in software teams.
Demonstrated the tool's applicability in diverse projects.
Abstract
Data from software repositories have become an important foundation for the empirical study of software engineering processes. A recurring theme in the repository mining literature is the inference of developer networks capturing, e.g., collaboration, coordination, or communication from the commit history of projects. Most of the studied networks are based on the co-authorship of software artefacts. Because this neglects detailed information on code changes and code ownership, we introduce git2net, a scalable Python software that facilitates the extraction of fine-grained co-editing networks in large git repositories. It uses text mining techniques to analyse the detailed history of textual modifications within files. We apply our tool in two case studies using GitHub repositories of multiple Open Source as well as a commercial software project. Specifically, we use data on more than 1.2…
