# Software technical debt prediction based on complex software networks

**Authors:** Bo Jiang, Jiaye Cen, Erluan Zhu, Jiale Wang

PMC · DOI: 10.1371/journal.pone.0323672 · 2025-06-09

## TL;DR

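This paper proposes predicting software technical debt (TD) by combining Social Network Analysis (SNA) metrics, computed on a Class Dependency Network, with traditional TD-related metrics. The combined metric suite improves the performance of machine-learning prediction models.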

## Contribution

The novel contribution is a combined metric suite (CMS) that pairs Social Network Analysis (SNA) metrics with traditional technical debt (TD) metrics to improve the performance of technical debt prediction (TDP) models.

## Key findings

- The combined metric suite improves the performance of technical debt prediction models.
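- XGBoost performs best among the seven classifiers evaluated, with an F2 value of 0.77, an MI ratio of approximately 0.10, and a recall close to 0.87.
- Different metric combinations vary in their effectiveness for TDP, and the paper reports their relative effectiveness.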
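
The F2 score in the findings above is the F-beta measure with beta = 2, which weights recall more heavily than precision. The following is a minimal sketch of that formula, not the authors' evaluation code; the function name `fbeta_from_counts` and the example counts are hypothetical and are not taken from the paper.

```python
def fbeta_from_counts(tp, fp, fn, beta=2.0):
    """Return the F-beta score from confusion-matrix counts.

    beta > 1 weights recall more heavily than precision;
    beta = 2 gives the F2 score.
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only (not results from the paper).
example_f2 = fbeta_from_counts(tp=80, fp=60, fn=20)
example_f1 = fbeta_from_counts(tp=80, fp=60, fn=20, beta=1.0)
```

With these illustrative counts recall is higher than precision, so the F2 value comes out above the F1 value. This is why F2 suits settings where missing a debt-prone class is considered more costly than a false alarm.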

## Abstract

Technical debt prediction (TDP) is crucial for the long-term maintainability of software. Many machine-learning-based TDP models have been proposed in the literature; they use TD-related metrics as input features for machine-learning classifiers. However, their performance is unsatisfactory. Developing and using more effective metrics is considered a promising way to improve the performance of TDP models. Social Network Analysis (SNA) uses a set of metrics (i.e., SNA metrics) to characterize software elements (classes, binaries, etc.) from the perspective of the software as a whole. SNA metrics are regarded as a complement to the TD-related metrics used in existing TDP work, and are therefore expected to improve the performance of existing TDP models. However, the effectiveness of SNA metrics for TDP has not yet been explored. To fill this gap, we propose an improved software technical debt prediction approach. First, we represent software as a Class Dependency Network, from which we compute the values of a set of SNA metrics. Second, we combine the SNA metrics with the TD-related metrics to create a combined metric suite (CMS). Third, we use the CMS as input features for seven commonly used machine-learning classifiers to build TDP models. Empirical results on a publicly available data set show that (i) the CMS can indeed improve the performance of existing TDP models; and (ii) XGBoost performs best among the seven classifiers, with an F2 value of 0.77, an MI ratio of approximately 0.10, and a recall close to 0.87. Furthermore, we reveal the relative effectiveness of different metric combinations.
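
The first two steps of the approach (building a Class Dependency Network, then merging network metrics with TD-related metrics) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not list which SNA metrics are used, so the three degree-based metrics here are stand-ins, and the class names, the `loc` and `wmc` features, and all function names are hypothetical.

```python
from collections import defaultdict


def build_cdn(dependencies):
    """Build a directed class dependency network from (source, target) pairs."""
    nodes = set()
    out_edges = defaultdict(set)
    in_edges = defaultdict(set)
    for src, dst in dependencies:
        nodes.add(src)
        nodes.add(dst)
        if src != dst:  # ignore self-dependencies
            out_edges[src].add(dst)
            in_edges[dst].add(src)
    return nodes, out_edges, in_edges


def sna_metrics(nodes, out_edges, in_edges):
    """Compute simple per-class network metrics (assumed stand-ins)."""
    n = len(nodes)
    result = {}
    for node in nodes:
        succ = out_edges.get(node, set())
        pred = in_edges.get(node, set())
        neighbours = succ | pred
        result[node] = {
            "in_degree": len(pred),
            "out_degree": len(succ),
            # Share of the other classes this class is directly connected to.
            "degree_centrality": len(neighbours) / (n - 1) if n > 1 else 0.0,
        }
    return result


def combine(td_metrics, sna):
    """Merge TD-related and SNA metrics into one feature dict per class."""
    return {
        cls: {**td_metrics[cls], **sna[cls]}
        for cls in td_metrics
        if cls in sna
    }


# Hypothetical dependencies and TD-related metrics, for illustration only.
deps = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "C")]
td = {
    "A": {"loc": 120, "wmc": 9},
    "B": {"loc": 45, "wmc": 3},
    "C": {"loc": 300, "wmc": 21},
    "D": {"loc": 60, "wmc": 4},
}
nodes, out_edges, in_edges = build_cdn(deps)
cms = combine(td, sna_metrics(nodes, out_edges, in_edges))
```

Each entry of `cms` is one feature row per class, which is the shape a classifier such as XGBoost would take as input in the third step.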

## Figures

50 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12148173/full.md

---
Source: https://tomesphere.com/paper/PMC12148173