# Distinguishing nontuberculous mycobacterial lung disease from pulmonary tuberculosis using radiomics machine learning models from CT images

**Authors:** Jiaofeng Zheng, Fang Wang, Xiangxin Zeng, Weiqiang Shu, Zhiyang He, Yurui Li, Yueqin Gao, Shengxiu Lv, Xueyan Liu

PMC · DOI: 10.3389/fmed.2026.1721949 · Frontiers in Medicine · 2026-01-21

## TL;DR

This study develops a CT radiomics machine learning model to distinguish non-tuberculous mycobacterial lung disease from pulmonary tuberculosis. On an external test cohort, the model reached an accuracy of 0.8286 and exceeded three radiologists in both sensitivity and specificity.

## Contribution

The novel contribution is the development and evaluation of a YeoJohnson_LR (LDA) radiomics model for differentiating NTM-LD from PTB using CT images.

## Key findings

- The YeoJohnson_LR (LDA) model achieved an accuracy of 0.8286 in an external test cohort.
- The model outperformed radiologists with gains of 3.12–15.62% in sensitivity and 6.85–12.33% in specificity.
- NTM-LD patients were significantly older than PTB patients (p < 0.001), and the gender distribution differed significantly between the two groups (p = 0.005).

## Abstract

To develop and evaluate a machine learning (ML) model based on CT radiomics for distinguishing non-tuberculous mycobacterial lung disease (NTM-LD) from pulmonary tuberculosis (PTB).

Chest CT images of patients with confirmed NTM-LD or PTB at Medical Center 1 between January 2019 and December 2024 were collected retrospectively. The dataset was divided into a training cohort and a validation cohort in a 7:3 ratio. Additionally, patients from Medical Center 2 were collected as an external test cohort. Radiomics models were constructed using five machine learning algorithms: Logistic Regression (LR), Random Forest (RF), Quadratic Discriminant Analysis (QDA), YeoJohnson_LR, and YeoJohnson_LR (LDA). Receiver operating characteristic (ROC) curves and the area under the curve (AUC) were used to evaluate the diagnostic efficacy of the five models and to select the optimal prediction model. The optimal model was then compared with three radiologists in the test cohort.
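The 7:3 split and the AUC evaluation described above can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the abstract does not say whether the split was stratified by diagnosis, so stratification is an assumption here, as are the function names `stratified_split` and `roc_auc` and the choice of NTM-LD as the positive class (label 1).

```python
import random

def stratified_split(labels, train_frac=0.7, seed=0):
    """Split indices into train/validation, keeping the class ratio.

    Assumption: stratification by class; the source only states a 7:3 ratio.
    """
    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train_idx.extend(idx[:cut])
        val_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(val_idx)

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Center 1 class counts from the abstract: 547 NTM-LD (label 1), 860 PTB (label 0).
labels = [1] * 547 + [0] * 860
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))
```

The pairwise AUC is quadratic in cohort size, which is acceptable for cohorts of this scale; a rank-based formulation would be preferable for much larger datasets.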

A total of 1,512 cases were included: 1,407 from Center 1 (NTM-LD: 547; PTB: 860) and 105 from Center 2 (NTM-LD: 32; PTB: 73). Patients in the NTM-LD group were significantly older than those in the PTB group (p < 0.001). There was a significant gender difference between the NTM-LD group and the PTB group (p = 0.005). Among the five models, the YeoJohnson_LR (LDA) model performed best, with an accuracy of 0.8286 in the external test cohort. The AUCs of the YeoJohnson_LR (LDA) model on the training, validation, and test cohorts were 0.8421, 0.8037, and 0.8233, respectively. Compared with the radiologists, the YeoJohnson_LR (LDA) model showed gains of 3.12–15.62% in sensitivity and 6.85–12.33% in specificity.
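To put the reported metrics in context, the short sketch below relates them to case counts in the external cohort. It assumes NTM-LD is the positive class (so sensitivity is computed over the 32 NTM-LD cases and specificity over the 73 PTB cases) and that the reported gains are percentage-point differences; neither is stated in the abstract. The helper names are illustrative only.

```python
def sensitivity(tp, fn):
    """Fraction of positive cases correctly identified."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of negative cases correctly identified."""
    return tn / (tn + fp)

# External test cohort sizes from the abstract.
N_NTM, N_PTB = 32, 73  # assumed positive / negative class

# An accuracy of 0.8286 on 105 cases corresponds to this many correct calls.
correct = round(0.8286 * (N_NTM + N_PTB))

# Percentage points gained per additional correctly classified case.
step_sens = 100 / N_NTM
step_spec = 100 / N_PTB

# Reported gains expressed as approximate numbers of cases.
sens_gain_cases = [round(g / step_sens) for g in (3.12, 15.62)]
spec_gain_cases = [round(g / step_spec) for g in (6.85, 12.33)]
print(correct, sens_gain_cases, spec_gain_cases)
```

Under these assumptions, the reported ranges are consistent with the model correctly identifying a handful more cases than each radiologist, which illustrates how coarse percentage differences are in a cohort of this size.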

The YeoJohnson_LR (LDA) model can be used to distinguish NTM-LD from PTB, assisting in rapid clinical diagnosis and benefiting patients with NTM-LD.

## Linked entities

- **Diseases:** non-tuberculous mycobacterial lung disease (MONDO:0018469), pulmonary tuberculosis (MONDO:0006052)

## Full-text entities

- **Diseases:** PTB (MESH:D014397), NTM-LD (MESH:D008171)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12868291/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12868291/full.md

## References

51 references — full list in the complete paper: https://tomesphere.com/paper/PMC12868291/full.md

---
Source: https://tomesphere.com/paper/PMC12868291