# Application Value of an AI-based Imaging Feature Parameter Model for Predicting the Malignancy of Part-solid Pulmonary Nodule

**Authors:** Mingzhi LIN, Yiming HUI, Bin LI, Peilin ZHAO, Zhizhong ZHENG, Zhuowen YANG, Zhipeng SU, Yuqi MENG, Tieniu SONG

PMC · DOI: 10.3779/j.issn.1009-3419.2025.102.13 · Chinese Journal of Lung Cancer · 2025-04-20

## TL;DR

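Using imaging features that AI software extracted automatically from chest CT, this study compared five machine-learning models and found that a random forest model best predicted whether part-solid nodules were malignant (test-set AUC 0.91), which could support clinicians' treatment decisions.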

## Contribution

The novel contribution is developing and evaluating an AI-based imaging model for predicting malignancy in part-solid pulmonary nodules using machine learning techniques.

## Key findings

- The random forest model achieved the highest test-set AUC (0.91) for predicting malignancy in part-solid nodules.
- Key predictive features included roughness (ngtdm), dependence variance (gldm), and short run low gray-level emphasis (glrlm).
- The model provides a reliable tool for clinicians to support personalized treatment decisions for lung nodules.

## Abstract

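Lung cancer is one of the most common malignant tumors worldwide and a leading cause of cancer-related death. Early-stage lung cancer usually presents as pulmonary nodules, and accurately assessing their risk of malignancy is essential for prolonging survival and avoiding overdiagnosis and overtreatment. This study aimed to build models from imaging feature parameters extracted automatically by artificial intelligence (AI) and to evaluate their performance in predicting the malignancy of part-solid nodules (PSNs).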

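Data on 229 PSNs from 222 patients who underwent pulmonary nodule resection at Lanzhou University Second Hospital between October 2020 and February 2025 were analyzed retrospectively. According to the pathological results, 45 benign lesions and precursor glandular lesions were assigned to the non-malignant group, and 184 malignant lung tumors to the malignant group. All patients underwent chest computed tomography (CT), and imaging feature parameters were extracted with AI software. Significant variables were screened by univariate analysis; variance inflation factors were calculated and highly collinear variables removed; LASSO regression further selected key features; and multivariate logistic regression identified independent risk factors. Based on the screening results, five models were built: logistic regression, random forest, XGBoost, LightGBM, and support vector machine. Model performance was evaluated with receiver operating characteristic (ROC) curves.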
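The ROC-based evaluation described above reduces each model to a single number, the area under the curve (AUC). As a minimal sketch of what that number measures, the snippet below computes AUC as the fraction of (malignant, non-malignant) pairs in which the malignant case receives the higher predicted score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations and are not taken from the paper or its software.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = malignant, 0 = non-malignant.
labels = [1, 1, 1, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.35, 0.4, 0.2, 0.7, 0.6, 0.55]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of a few hundred nodules; a rank-based formulation gives the same value more efficiently on larger data.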

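The independent risk factors for PSN malignancy were roughness (ngtdm), dependence variance (gldm), and short run low gray-level emphasis (glrlm). Logistic regression performed well, with areas under the curve (AUCs) of 0.86 on the training set and 0.89 on the test set. XGBoost had AUCs of 0.78 and 0.77, respectively; its performance was relatively balanced, but its accuracy was lower. The support vector machine had a training AUC of 0.93 that fell to 0.80 on the test set, indicating some overfitting. LightGBM performed very well on the training set (AUC 0.94) but declined on the test set (AUC 0.88). The random forest model was stable across both sets, with a training AUC of 0.89 and a test AUC of 0.91, showing high stability and good generalization.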
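The comparison above can be tabulated directly from the reported AUCs. The sketch below computes the train-minus-test gap for each model and picks out the best test AUC and the largest gap. Reading a large positive gap as a sign of overfitting is a common rule of thumb and is used here as an assumption; the abstract does not state a formal criterion, and the variable and function names are illustrative.

```python
# (training AUC, test AUC) as reported in the abstract.
reported_auc = {
    "Logistic regression": (0.86, 0.89),
    "Random forest": (0.89, 0.91),
    "XGBoost": (0.78, 0.77),
    "LightGBM": (0.94, 0.88),
    "Support vector machine": (0.93, 0.80),
}


def train_test_gap(train_auc, test_auc):
    """Training AUC minus test AUC, rounded to the reported precision."""
    return round(train_auc - test_auc, 2)


gaps = {name: train_test_gap(*aucs) for name, aucs in reported_auc.items()}
best_on_test = max(reported_auc, key=lambda name: reported_auc[name][1])
largest_gap = max(gaps, key=gaps.get)

for name, (train_auc, test_auc) in reported_auc.items():
    print(f"{name:24s} train={train_auc:.2f} test={test_auc:.2f} gap={gaps[name]:+.2f}")
print("Best test AUC:", best_on_test)
print("Largest train-test gap:", largest_gap)
```

The abstract reports no confidence intervals, so the small differences among the top three test AUCs (0.91, 0.89, 0.88) should be read with caution.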

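The random forest model built on the independent risk factors performed best in predicting PSN malignancy. It can provide clinicians with effective auxiliary prediction and support individualized treatment decisions.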

Table captions from the complete paper:

- General clinical data of the two groups
- Imaging feature parameters of the two groups
- Univariate and multivariate logistic regression
- Comparison of the predictive efficacy of the five models

## Linked entities

- **Diseases:** lung cancer (MONDO:0005138)

## Full-text entities

- **Diseases:** Malignancy (MESH:D009369), Lung cancer (MESH:D008175), Pulmonary Nodule (MESH:D055613), PSN (MESH:D016606)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12096090/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12096090/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/PMC12096090/full.md

---
Source: https://tomesphere.com/paper/PMC12096090