# Assessing training needs and influencing factors among personnel at centers for disease control and prevention in northeast China: a cross-sectional study framed by SDT and TPB using machine learning techniques

**Authors:** Kexin Wang, Peng Wang, Min Wei, Yanping Wang, Huan Liu, Ruiqian Zhuge, Qunkai Wang, Nan Meng, Yiran Gao, Yuxuan Wang, Lijun Gao, Jingjing Liu, Xin Zhang, Mingli Jiao, Qunhong Wu

PMC · DOI: 10.1186/s12889-025-23393-w · BMC Public Health · 2025-06-10

## TL;DR

This study uses machine learning to identify training needs and influencing factors among public health workers in northeast China, focusing on improving training effectiveness.

## Contribution

The study integrates SDT, TPB, and machine learning to analyze training needs and their influencing factors among public health personnel.

## Key findings

- Four competency subgroups were identified among public health personnel.
- XGBoost outperformed other models in predicting training needs with an AUC of 0.702.
- On-job training satisfaction and intrinsic motivation were key factors influencing training needs.

## Abstract

Training public health personnel is crucial for enhancing the capacity of public health systems. However, existing research often falls short in providing a comprehensive theoretical framework and fails to account for the intricate interplay of multi-dimensional factors in public health. This study aims to identify knowledge and skill gaps at both individual and organizational levels, and to explore multi-dimensional factors influencing training needs within the theoretical frameworks of the Theory of Planned Behavior and Self-Determination Theory.

This cross-sectional study used stratified cluster sampling to conduct an online survey among personnel at Centers for Disease Control and Prevention in Heilongjiang Province, Jilin Province, Liaoning Province, and the Inner Mongolia Autonomous Region during May 2023. A total of 11,912 valid questionnaires were collected. Latent Class Analysis was used to identify competency subgroups covering professional, general, and management abilities. The Boruta algorithm was used to select features and improve the performance of the subsequent predictive models. Logistic regression, random forest, least absolute shrinkage and selection operator (LASSO), and extreme gradient boosting (XGBoost) were used to predict training needs and explore the impact of multi-dimensional factors. SHapley Additive exPlanations (SHAP) were used to explain the output of the best-performing machine learning model.
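The shadow-feature idea behind the Boruta step can be illustrated with a minimal sketch. This is not the authors' code. Boruta normally scores features with random forest importances and applies a statistical test to the hit counts; here the importance measure is swapped for absolute Pearson correlation so the example needs only the standard library. The function names, the toy columns, and the round count are all assumptions made for illustration.

```python
import random


def abs_pearson(xs, ys):
    """Absolute Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / (var_x * var_y) ** 0.5


def shadow_feature_hits(columns, target, n_rounds=20, seed=0):
    """Count how often each real feature beats the best shuffled (shadow) copy.

    Simplification: importance is absolute correlation with the target,
    not the random forest importance that Boruta itself uses.
    """
    rng = random.Random(seed)
    hits = {name: 0 for name in columns}
    for _ in range(n_rounds):
        # Shadow features: each real column shuffled, which breaks any
        # relationship with the target while keeping the value distribution.
        shadow_scores = []
        for values in columns.values():
            shadow = list(values)
            rng.shuffle(shadow)
            shadow_scores.append(abs_pearson(shadow, target))
        threshold = max(shadow_scores)
        # A real feature scores a hit only if it beats the best shadow.
        for name, values in columns.items():
            if abs_pearson(values, target) > threshold:
                hits[name] += 1
    return hits


# Toy data (hypothetical, not from the survey): one informative column and
# one column constructed to have exactly zero correlation with the target.
target = [i % 2 for i in range(40)]
columns = {
    "signal": [2 * t + 0.1 * (i % 3) for i, t in enumerate(target)],
    "noise": [(i // 2) % 2 for i in range(40)],
}
hits = shadow_feature_hits(columns, target)
print(hits)
```

In this sketch, features that score hits in nearly every round play the role of "confirmed" variables, and those that rarely or never do play the role of "rejected" ones. The actual algorithm decides this with a binomial test rather than raw counts.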

This study identified four subgroups of competency patterns: novices (25.3%), public health experts (15.1%), potential expansion talents (24.7%), and versatile talents (34.9%). The Boruta algorithm identified 9 confirmed variables, 3 tentative variables, and 30 rejected variables. The XGBoost model performed best among the models compared, with an AUC of 0.702 and accuracy, precision, recall, and F1 score of 0.6485, 0.6564, 0.6301, and 0.6430, respectively. SHAP analysis of the XGBoost model indicated that on-job training satisfaction had a strong association with training needs among public health personnel. Self-improvement needs, college education satisfaction, workload, competency patterns, and team cohesion were also important factors.
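As a quick consistency check on the reported metrics, the F1 score can be recomputed as the harmonic mean of the reported precision and recall. The sketch below also shows how accuracy, precision, recall, and F1 follow from a confusion matrix. The confusion-matrix counts are hypothetical and chosen only for illustration; the paper summary does not report them.

```python
def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


# Recompute F1 from the precision and recall reported for XGBoost.
recomputed_f1 = f1_from(0.6564, 0.6301)
print(round(recomputed_f1, 4))

# Hypothetical counts (assumed for illustration, not taken from the study).
example = classification_metrics(tp=63, fp=33, fn=37, tn=67)
print(example)
```

The recomputed value agrees with the reported F1 of 0.6430 to four decimal places, so the three reported figures are internally consistent.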

Intrinsic motivation is the key factor influencing the training needs of public health personnel. When formulating training plans, priority should be given to improving on-job training satisfaction and to designing a more targeted training curriculum tailored to competency patterns. Organizational incentives that motivate trainees, and the integration of career development goals into training program design, are also important. Setting training priorities is therefore key to ensuring that training programs are targeted and effective, thereby promoting individual and organizational career development.

The online version contains supplementary material available at 10.1186/s12889-025-23393-w.

## Full-text entities

- **Genes:** SHROOM4 (shroom family member 4) [NCBI Gene 57477] {aka MRXSSDS, SHAP, shrm4}
- **Diseases:** infectious diseases (MESH:D003141), COVID (MESH:D000086382), LCA (MESH:D000085343)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12150469/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12150469/full.md

## References

2 references — full list in the complete paper: https://tomesphere.com/paper/PMC12150469/full.md

---
Source: https://tomesphere.com/paper/PMC12150469