# Credit risk prediction model for listed companies based on improved reinforcement learning and Bayesian optimization hyperband

**Authors:** Cai Yuanqing, Zhenming Gao, Zhang Jian, Roohallah Alizadehsani, Paweł Pławiak

PMC · DOI: 10.1371/journal.pone.0332150 · PLOS One · 2025-10-28

## TL;DR

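This paper introduces a credit risk prediction model for listed companies that uses off-policy proximal policy optimization (PPO) for feature selection and imbalanced classification and Bayesian optimization hyperband (BOHB) for hyperparameter tuning. The authors report F-measures between 86.358% and 90.763% on five datasets.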

## Contribution

The novel approach combines off-policy PPO for feature selection and imbalanced classification with BOHB for hyperparameter optimization.
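To make the PPO half of this combination concrete, the sketch below computes the standard clipped surrogate objective that PPO maximizes. It is an illustration only: the paper's off-policy variant and its reward design are described in the full text and are not reproduced here. The function name `clipped_surrogate` and the default `clip_eps=0.2` are assumptions chosen for this example.

```python
import math

def clipped_surrogate(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates for the same actions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Importance ratio between the current and the data-collecting policy.
        ratio = math.exp(lp_new - lp_old)
        # Clipping limits how much a single update can gain from moving
        # the ratio away from 1.
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

The importance ratio is what lets a batch of past experience be reused for an update. With `clip_eps=0.2`, a sample whose ratio is 2 and whose advantage is 1 contributes 1.2 rather than 2, while the same ratio with advantage -1 contributes the full -2.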

## Key findings

- The model achieved F-measures of 90.763%, 86.358%, 87.047%, 90.576%, and 89.485% on the five evaluation datasets, a range of 86.358% to 90.763%.
- The authors report that the method outperforms the state-of-the-art credit risk prediction models they compare against.
- The approach improves sample efficiency and handles imbalanced data effectively.
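The findings above are stated as F-measures, a common choice when defaults are rare. The sketch below shows how an F-measure is computed from binary labels and why plain accuracy can mislead on imbalanced data. It assumes label `1` marks the minority (default) class; whether the paper uses F1 or another value of `beta` is not stated in this summary, so `beta=1.0` is an assumption.

```python
def f_measure(y_true, y_pred, beta=1.0):
    """F-beta score for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Toy data: 95 non-defaulting firms and 5 defaulting firms.
y_true = [0] * 95 + [1] * 5

# A classifier that never predicts default is 95% accurate but useless.
always_negative = [0] * 100

# A classifier that catches 4 of 5 defaults and raises 2 false alarms.
better = [1, 1] + [0] * 93 + [1, 1, 1, 1, 0]
```

On this toy data, `f_measure(y_true, always_negative)` is 0 despite 95% accuracy, while `f_measure(y_true, better)` is 8/11, roughly 0.727.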

## Abstract

The financial sector has experienced swift growth over recent years, leading to the escalating prominence of credit risk among publicly traded companies. Consequently, forecasting credit risk for these firms has emerged as a critical task for banks, regulatory bodies, and investors. Traditional models include the z-score, the logit (logistic regression model), the KMV model, and neural network approaches. Nevertheless, the outcomes from these methods have often fallen short of expectations. Three major challenges in previous works are feature selection, imbalanced classification, and hyperparameter optimization. This paper presents a method for credit risk prediction for listed companies that uses an off-policy proximal policy optimization (PPO) algorithm for feature selection and imbalanced classification. The off-policy PPO, a reinforcement learning (RL) approach, enhances sample efficiency by more effectively utilizing past experiences during policy updates. This approach improves feature selection and the management of imbalanced classification by optimizing data use, thereby enhancing model training outcomes. Moreover, we use the Bayesian optimization hyperband (BOHB) approach to refine the hyperparameters of the method. BOHB merges Bayesian optimization and Hyperband, significantly speeding up the optimization process. We assess our model using the China Stock Market and Accounting Research (CSMAR), MorningStar, KMV default, Give Me Some Credit (GMSC), and the University of California, Irvine Credit Card Default (UCICCD) datasets. Our experimental findings demonstrate the excellence of the model over existing state-of-the-art models, achieving F-measures of 90.763%, 86.358%, 87.047%, 90.576%, and 89.485% on these datasets. These findings validate the efficiency of the method in economic settings, signifying a major progression in systems for predicting credit risk and enhancing investigative approaches.
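The abstract notes that BOHB merges Bayesian optimization with Hyperband. The Hyperband half rests on successive halving: evaluate many configurations on a small budget, keep the best fraction, and give the survivors a larger budget. The sketch below illustrates that loop only. In BOHB the candidate configurations come from a model fitted to earlier results, whereas this example takes a fixed list. The names `successive_halving` and `toy_loss`, the reduction factor `eta=3`, and the toy objective are assumptions for illustration, not the paper's setup.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Return the configuration that survives repeated halving rounds.

    configs: candidate hyperparameter settings.
    evaluate(config, budget): returns a loss (lower is better) measured
    with the given training budget, e.g. a number of epochs.
    """
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Rank the current survivors by loss at the current budget.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        # Keep the best 1/eta of them (at least one).
        keep = max(1, len(ranked) // eta)
        survivors = ranked[:keep]
        # Survivors are re-evaluated with eta times more budget next round.
        budget *= eta
    return survivors[0]

def toy_loss(learning_rate, budget):
    """Hypothetical loss: best near 0.1, with an offset that shrinks as budget grows."""
    return abs(learning_rate - 0.1) + 1.0 / budget

candidates = [0.5, 0.09, 0.3, 0.01, 0.2, 0.7, 0.12, 0.9, 0.05]
best = successive_halving(candidates, toy_loss)
```

With nine candidates and `eta=3`, the first round keeps three configurations and the second keeps one, so most of the budget is spent on the most promising settings.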

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12561926/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12561926/full.md

## References

64 references — full list in the complete paper: https://tomesphere.com/paper/PMC12561926/full.md

---
Source: https://tomesphere.com/paper/PMC12561926