Underproduction: An Approach for Measuring Risk in Open Source Software

Kaylea Champion; Benjamin Mako Hill

arXiv:2103.00352·cs.SE·December 10, 2024

TL;DR

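This paper introduces a conceptual framework and a statistical method for identifying underproduction, a risk that arises when the supply of software engineering labor falls out of alignment with the demand of the people who rely on the software. Applied to Debian, the method finds widespread underproduction among the distribution's packages.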

Contribution

It presents a conceptual framework and a statistical approach for detecting underproduction risk in FLOSS, validated on a Debian dataset of 21,902 source packages and the full history of 461,656 bugs.

Findings

01. Widespread underproduction detected in Debian packages

02. Method effectively identifies at-risk software components

03. Validation confirms the approach's utility

Abstract

The widespread adoption of Free/Libre and Open Source Software (FLOSS) means that the ongoing maintenance of many widely used software components relies on the collaborative effort of volunteers who set their own priorities and choose their own tasks. We argue that this has created a new form of risk that we call 'underproduction' which occurs when the supply of software engineering labor becomes out of alignment with the demand of people who rely on the software produced. We present a conceptual framework for identifying relative underproduction in software as well as a statistical method for applying our framework to a comprehensive dataset from the Debian GNU/Linux distribution that includes 21,902 source packages and the full history of 461,656 bugs. We draw on this application to present two experiments: (1) a demonstration of how our technique can be used to identify at-risk…
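To make the idea of relative underproduction concrete, here is a minimal sketch that compares where each package ranks on importance with where it ranks on quality. It is an illustration only, not the statistical method the paper describes. The function names, the package names, and the two proxies (install counts for demand, median days to resolve a bug for quality) are all assumptions introduced for this example.

```python
from typing import Dict, List, Tuple


def rank_positions(scores: Dict[str, float], higher_is_better: bool) -> Dict[str, int]:
    """Map each package to its rank (1 = best); ties are broken by name."""
    ordered = sorted(
        scores,
        key=lambda name: (-scores[name] if higher_is_better else scores[name], name),
    )
    return {name: position for position, name in enumerate(ordered, start=1)}


def underproduction_gaps(
    installs: Dict[str, float],
    median_days_to_fix: Dict[str, float],
) -> List[Tuple[str, int]]:
    """Return (package, gap) pairs, largest gap first.

    gap = quality rank - importance rank. A positive gap means the package
    ranks worse on quality than on importance, which this sketch treats as
    a sign of relative underproduction. Both dicts are assumed to share
    the same set of package names.
    """
    importance_rank = rank_positions(installs, higher_is_better=True)
    quality_rank = rank_positions(median_days_to_fix, higher_is_better=False)
    gaps = {name: quality_rank[name] - importance_rank[name] for name in installs}
    return sorted(gaps.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical data, invented for illustration only.
    installs = {"pkg_a": 9000, "pkg_b": 500, "pkg_c": 4000}
    median_days_to_fix = {"pkg_a": 120.0, "pkg_b": 10.0, "pkg_c": 45.0}
    for name, gap in underproduction_gaps(installs, median_days_to_fix):
        print(name, gap)
```

In this toy data, `pkg_a` is the most installed package but the slowest to have bugs resolved, so it gets the largest positive gap. A package whose quality rank matches or beats its importance rank gets a gap of zero or below.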
