Empirical Growing Networks vs Minimal Models: Evidence and Challenges from Software Heritage and APS Citation Datasets
Guillaume Rousseau

TL;DR
This paper analyzes large-scale software and citation networks to understand their growth dynamics, revealing regime shifts and challenges in confirming scale-free properties, and highlights the need for refined models and analysis tools.
Contribution
It introduces a detailed temporal analysis of empirical networks, exposing structural transitions and assessing the applicability of minimal models to real-world data.
Findings
The Software Heritage network exhibits regime shifts linked to developer practices
The citation network shows a major change in growth regime after 1985
Estimates of scale-free exponents are sensitive to regime shifts and outliers
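To illustrate why a regime shift can bias an exponent estimate, here is a minimal sketch on synthetic data. It is not the paper's estimation procedure: it assumes a continuous power law p(x) ~ x^(-alpha) for x >= x_min and uses the standard continuous maximum-likelihood estimator, whereas real degree sequences are discrete. The function names, sample sizes, and exponents are hypothetical choices made for illustration.

```python
import math
import random


def powerlaw_mle(values, x_min):
    """Continuous MLE of alpha for p(x) ~ x^(-alpha), restricted to x >= x_min."""
    tail = [x for x in values if x >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("all observations equal x_min; estimate undefined")
    return 1.0 + len(tail) / log_sum


def sample_powerlaw(n, alpha, x_min, rng):
    """Draw n values by inverse-transform sampling from a continuous power law."""
    # 1 - rng.random() lies in (0, 1], so the power is always defined.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Two hypothetical growth regimes with different true exponents.
regime_a = sample_powerlaw(20000, 2.5, 1.0, rng)
regime_b = sample_powerlaw(20000, 3.5, 1.0, rng)

alpha_a = powerlaw_mle(regime_a, 1.0)
alpha_mixed = powerlaw_mle(regime_a + regime_b, 1.0)

print(f"single regime: alpha = {alpha_a:.2f}")
print(f"two regimes pooled: alpha = {alpha_mixed:.2f}")
```

Pooling the two regimes yields a single exponent that lies between the two true values and describes neither regime, which is one way a regime shift can distort an estimate made over the whole network.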
Abstract
We investigate the evolution rules and degree distribution properties of the Software Heritage dataset, a large-scale growing network linking software source-code versions from open-source communities. The network spans more than 40 years and includes about 6 billion nodes and edges. Our analysis relies on deterministic temporal and topological partitions of nodes and edges, which account for the multilayer and partially timestamped structure of the main graph. We derive a temporal graph that reveals a mesoscale structure and enables the study of edge dynamics (creation, inheritance, and aging), together with comparisons to minimal models using degree distributions and histograms of edge timestamp differences. The temporal graph also exposes regime shifts that correlate with changes in developer practices, as reflected in the average number of edges per new node. We estimate scaling…
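The two diagnostics named in the abstract, the average number of edges per new node and the histogram of edge timestamp differences, can be sketched on a toy graph as follows. This is an illustrative sketch, not the paper's pipeline: it assumes every node carries a creation year (the real dataset is only partially timestamped), attributes each edge to the creation year of its source node, and uses hypothetical function names and data.

```python
from collections import Counter


def edges_per_new_node(node_year, edges):
    """Average number of outgoing edges per node created in each year.

    node_year: dict mapping node -> creation year (assumed fully known here).
    edges: list of (source, target) pairs; each edge is attributed to the
    creation year of its source node.
    """
    new_nodes = Counter(node_year.values())
    new_edges = Counter(node_year[src] for src, _ in edges)
    return {year: new_edges[year] / count for year, count in sorted(new_nodes.items())}


def timestamp_differences(node_year, edges):
    """Histogram of (source year - target year) over all edges."""
    return Counter(node_year[src] - node_year[dst] for src, dst in edges)


# Toy data: four nodes, each new node links to every earlier node.
node_year = {"a": 2000, "b": 2001, "c": 2001, "d": 2002}
edges = [("b", "a"), ("c", "a"), ("c", "b"), ("d", "a"), ("d", "b"), ("d", "c")]

ratios = edges_per_new_node(node_year, edges)
diffs = timestamp_differences(node_year, edges)

print(ratios)
print(dict(diffs))
```

A sustained jump in the per-year ratio is the kind of signal the abstract associates with a regime shift, while the shape of the difference histogram reflects how quickly older nodes stop receiving new edges.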
Taxonomy
Topics: Greenhouse Technology and Climate Control · Interconnection Networks and Systems · VLSI and FPGA Design Techniques
