# Machine learning prediction of pharmacist intervention benefit in tuberculosis patients using clinical parameters: a single-center retrospective study

**Authors:** Tingting Li, Huanqing Liu, Zhuhong You, Qian Lei

PMC · DOI: 10.3389/fcimb.2026.1749499 · Frontiers in Cellular and Infection Microbiology · 2026-02-04

## TL;DR

A machine learning model was developed to predict which tuberculosis patients would benefit from pharmacist interventions, using clinical data to improve treatment outcomes.

## Contribution

A novel machine learning model with extensive optimization and data augmentation to predict pharmacist intervention benefit in TB patients.

## Key findings

- The model achieved 92.25% accuracy and 96.96% AUC-ROC in predicting pharmacist intervention group assignment.
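- A separate transcriptomic analysis of public GEO datasets identified 150 significantly differentially expressed genes (FDR < 0.05) enriched in immune response and inflammation pathways; these data were not used as model features.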
- Data augmentation and feature engineering significantly improved model performance despite limited sample size.

## Abstract

Tuberculosis (TB) remains a major global health challenge, with an estimated 10 million new cases and 1.4 million deaths annually. Identifying patients who would benefit from comprehensive pharmacist intervention services is critical for optimizing treatment outcomes and resource allocation. We developed a machine learning model to predict pharmacist intervention group assignment at hospital admission using clinical parameters.

We conducted a retrospective analysis of 467 TB patients from a tertiary care hospital. The prediction model was trained exclusively on clinical variables to predict pharmacist intervention group assignment (binary classification: intervention group = 1, control group = 0). To address limited sample size, we implemented data augmentation using multi-neighbor interpolation, expanding the dataset to 1,999 samples (328.1% increase). We developed an extensive feature engineering pipeline generating 122 optimized features and employed Optuna-based hyperparameter optimization (250 trials) with a multi-level ensemble architecture comprising 43 base models. Separately, we analyzed publicly available GEO datasets to provide biological interpretation and mechanistic insights, but these transcriptomic data were not used as model features.
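The abstract does not spell out how the multi-neighbor interpolation works, so the following is only a minimal sketch of one plausible reading: each synthetic row is a random convex combination of an original row and its `k` nearest same-class neighbors (a SMOTE-like scheme). The function name `augment_by_interpolation`, the parameter `k`, the weighting scheme, and the toy data are all assumptions made for illustration, not the authors' implementation. The toy run also reproduces the stated arithmetic: growing 467 rows to 1,999 is a 328.1% increase.

```python
import math
import random

def augment_by_interpolation(X, y, n_target, k=5, seed=0):
    """Grow (X, y) to n_target rows by interpolating within each class.

    Hypothetical scheme: each synthetic row is a random convex combination
    of a seed row and its k nearest same-class neighbours.
    """
    rng = random.Random(seed)
    X_out = [list(row) for row in X]
    y_out = list(y)
    # Group original row indices by label so classes are never mixed.
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    while len(X_out) < n_target:
        # Drawing the seed row uniformly keeps class proportions roughly stable.
        i = rng.randrange(len(X))
        label = y[i]
        # k nearest same-class neighbours of row i (Euclidean distance).
        others = sorted(
            (j for j in by_class[label] if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )[:k]
        group = [i] + others
        # Random positive weights normalised to sum to 1.
        raw = [rng.random() + 1e-12 for _ in group]
        total = sum(raw)
        weights = [w / total for w in raw]
        new_row = [
            sum(w * X[j][d] for w, j in zip(weights, group))
            for d in range(len(X[i]))
        ]
        X_out.append(new_row)
        y_out.append(label)
    return X_out, y_out

# Toy stand-in data: 467 rows, 3 made-up numeric features, binary labels.
toy_rng = random.Random(42)
X_toy = [[toy_rng.gauss(0, 1) for _ in range(3)] for _ in range(467)]
y_toy = [toy_rng.randint(0, 1) for _ in range(467)]

X_aug, y_aug = augment_by_interpolation(X_toy, y_toy, n_target=1999)
increase_pct = 100 * (len(X_aug) - len(X_toy)) / len(X_toy)
print(len(X_aug), round(increase_pct, 1))  # prints: 1999 328.1
```

Because every synthetic row is a convex combination of same-class originals, it stays inside the per-class range of each feature. In a sketch like this, augmentation would normally be applied to training folds only, so that interpolated copies of a patient do not appear in the evaluation data.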

The final ensemble model achieved an accuracy of 92.25% (95% CI: 89.1-95.4%) and an AUC-ROC of 96.96% (95% CI: 94.8-99.1%) in predicting pharmacist intervention group assignment, demonstrating the substantial impact of the optimization strategies employed. Analysis of GEO datasets identified 150 significantly differentially expressed genes (FDR < 0.05) and revealed enrichment in immune response and inflammation pathways, providing supportive biological context for the clinical prediction model.
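The summary does not say how the 95% confidence intervals were obtained. As a hedged illustration only, the sketch below computes accuracy and AUC-ROC on toy predictions and attaches percentile bootstrap intervals, which is one common way to report such ranges. The helper names (`accuracy`, `auc_roc`, `bootstrap_ci`), the number of resamples, and the toy scores are assumptions and are unrelated to the study's data.

```python
import random

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def auc_roc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(metric, y_true, y_out, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for metric(y_true, y_out)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if len(set(yt)) < 2:
            continue  # AUC is undefined unless both classes are present
        stats.append(metric(yt, [y_out[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy labels and noisy scores standing in for held-out model output.
toy_rng = random.Random(1)
y_true = [i % 2 for i in range(100)]
scores = [0.3 + 0.4 * t + toy_rng.gauss(0, 0.25) for t in y_true]
y_pred = [int(s >= 0.5) for s in scores]

acc_ci = bootstrap_ci(accuracy, y_true, y_pred)
auc_ci = bootstrap_ci(auc_roc, y_true, scores)
print("accuracy", round(accuracy(y_true, y_pred), 3), acc_ci)
print("AUC-ROC", round(auc_roc(y_true, scores), 3), auc_ci)
```

The pairwise AUC used here is quadratic in sample size, which is fine for a small illustration; a rank-based formula would scale better on larger test sets.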

Our study demonstrates that comprehensive machine learning optimization can achieve strong predictive performance for identifying patients who would benefit from pharmacist intervention. The clinical prediction model, trained exclusively on clinical variables, provides a robust framework for personalized TB treatment resource allocation. Supportive transcriptomic analyses provide biological context but are not used in model prediction. The model’s accuracy (92.25%) and discriminative ability (AUC 96.96%) suggest potential for clinical implementation.

## Linked entities

- **Diseases:** Tuberculosis (MONDO:0018076)

## Full-text entities

- **Genes:** CD4 (CD4 molecule) [NCBI Gene 920] {aka CD4mut, IMD79, Leu-3, OKT4D, T4}, CD8A (CD8 subunit alpha) [NCBI Gene 925] {aka CD8, CD8alpha, IMD116, Leu2, p32}, CRP (C-reactive protein) [NCBI Gene 1401] {aka PTX1}, SLC17A5 (solute carrier family 17 member 5) [NCBI Gene 26503] {aka AST, ISSD, NSD, SD, SIALIN, SIASD}
- **Diseases:** HIV (MESH:D015658), infectious (MESH:D003141), drug reactions (MESH:D004342), Mycobacterium tuberculosis (MESH:D014376), infected (MESH:D007239), extensively drug-resistant TB (MESH:D054908), end-stage renal disease (MESH:D007676), deaths (MESH:D003643), diabetes (MESH:D003920), cancer (MESH:D009369), pulmonary or extrapulmonary TB (MESH:D000092225), HL (MESH:C538324), inflammation (MESH:D007249), cirrhosis (MESH:D005355), MDR-TB (MESH:D018088)
- **Chemicals:** MLP (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12913530/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12913530/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/PMC12913530/full.md

---
Source: https://tomesphere.com/paper/PMC12913530