# Patterns of Effort Contribution and Demand and User Classification based on Participation Patterns in NPM Ecosystem

**Authors:** Tapajit Dey, Yuxing Ma, Audris Mockus

arXiv: 1907.06538 · 2019-07-16

## TL;DR

This study analyzes patterns of effort contribution and demand among users of the NPM ecosystem, identifies distinct user groups, and predicts users' company affiliation. Most effort goes to packages that users depend on directly, with only a tiny fraction reaching transitive dependencies.

## Contribution

The paper characterizes effort contribution and demand across the NPM supply chain, identifies user groups with fuzzy c-means clustering, and shows that company affiliation can be predicted from participation data with modest accuracy (Random Forest, AUC-ROC of 0.68).
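To make the affiliation-prediction metric concrete, here is a minimal sketch of how an AUC-ROC value such as the one reported in the abstract can be computed from model scores. It is not the authors' code: the function name `auc_roc` and the toy labels and scores are hypothetical, and the pairwise definition used here (the probability that a randomly chosen positive example is scored above a randomly chosen negative one) is a standard equivalent of the area under the ROC curve.

```python
def auc_roc(labels, scores):
    """Return the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC-ROC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = company-affiliated user, 0 = unaffiliated user.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.3, 0.4]
toy_auc = auc_roc(toy_labels, toy_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so 0.68 indicates a model that ranks affiliated users above unaffiliated ones more often than chance, but far from always.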

## Key findings

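- Users contribute effort to, and demand effort from, primarily the packages they depend on directly; only a tiny fraction goes to transitive dependencies.
- A significant portion of demand goes to packages outside the users' own supply chains (as constructed from publicly visible version control data).
- Three user groups emerge from effort demand patterns and two from effort contribution patterns.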
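The distinction between direct dependencies, transitive dependencies, and packages outside a user's supply chain can be illustrated with a small sketch. This is an assumption-laden illustration, not the paper's pipeline: the function `classify_target`, the dictionary-based dependency graph, and the toy package relationships are all hypothetical.

```python
from collections import deque


def classify_target(direct_deps, dep_graph, package):
    """Label `package` relative to a user's supply chain.

    direct_deps: set of packages the user's own projects depend on.
    dep_graph:   hypothetical mapping from a package to its dependencies.
    Returns "direct", "transitive", or "outside".
    """
    if package in direct_deps:
        return "direct"

    # Breadth-first search through the dependencies of the direct dependencies.
    seen = set(direct_deps)
    queue = deque(direct_deps)
    while queue:
        current = queue.popleft()
        for dep in dep_graph.get(current, ()):
            if dep == package:
                return "transitive"
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return "outside"


# Toy graph for illustration only; not real dependency data.
dep_graph = {"express": ["body-parser", "debug"], "debug": ["ms"]}
direct = {"express"}
labels = {p: classify_target(direct, dep_graph, p) for p in ["express", "ms", "lodash"]}
```

In this toy setup, an issue filed against `ms` would count as demand on a transitive dependency, while one filed against `lodash` would count as demand outside the supply chain.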

## Abstract

Background: Open source requires participation of volunteer and commercial developers (users) in order to deliver functional high-quality components. Developers both contribute effort in the form of patches and demand effort from the component maintainers to resolve issues reported against it. Aim: Identify and characterize patterns of effort contribution and demand throughout the open source supply chain and investigate if and how these patterns vary with developer activity; identify different groups of developers; and predict developers' company affiliation based on their participation patterns. Method: 1,376,946 issues and pull-requests created for 4433 NPM packages with over 10,000 monthly downloads, together with the full (public) commit activity data of the 272,142 issue creators, are obtained and analyzed, and dependencies on NPM packages are identified. The fuzzy c-means clustering algorithm is used to find groups among the users based on their effort contribution and demand patterns, and Random Forest is used as the predictive modeling technique to identify their company affiliations. Result: Users contribute and demand effort primarily from packages that they depend on directly, with only a tiny fraction of contributions and demand going to transitive dependencies. A significant portion of demand goes into packages outside the users' respective supply chains (constructed based on publicly visible version control data). Three and two different groups of users are observed based on the effort demand and effort contribution patterns, respectively. The Random Forest model used for identifying the company affiliation of the users gives an AUC-ROC value of 0.68. Conclusion: Our results give new insights into effort demand and supply at different parts of the supply chain of the NPM ecosystem and its users, and suggest the need to increase visibility further upstream.
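As a rough illustration of the clustering step named in the abstract, the following is a minimal pure-Python sketch of fuzzy c-means. It is not the authors' implementation: the function names, the fuzzifier `m=2.0`, the fixed iteration count, the explicit initial centers, and the two-feature toy points are all assumptions made for this example. Unlike hard clustering, each point receives a degree of membership in every cluster, and each row of memberships sums to 1.

```python
import math


def memberships_for(points, centers, m):
    """Compute the membership of every point in every cluster."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        # Clamp distances away from zero to avoid division by zero.
        dists = [max(math.dist(p, c), 1e-12) for c in centers]
        rows.append([1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists])
    return rows


def fuzzy_c_means(points, initial_centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    centers = [tuple(c) for c in initial_centers]
    dims = len(points[0])
    for _ in range(iterations):
        u = memberships_for(points, centers, m)
        centers = []
        for j in range(len(initial_centers)):
            # Each center is the mean of all points, weighted by membership ** m.
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            centers.append(
                tuple(sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dims))
            )
    return centers, memberships_for(points, centers, m)


# Hypothetical two-feature user profiles forming two well-separated groups.
points = [(0.0, 0.1), (0.1, 0.0), (0.9, 1.0), (1.0, 0.9)]
centers, memberships = fuzzy_c_means(points, initial_centers=[(0.0, 0.0), (1.0, 1.0)])
```

Here `memberships[i][j]` is the degree to which user `i` belongs to group `j`; a user near the boundary between groups would receive intermediate values rather than being forced into one group.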

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.06538/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1907.06538/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/1907.06538/full.md

---
Source: https://tomesphere.com/paper/1907.06538