Machine Learning for Classification in Lung Cancer Using Routine Clinical and Laboratory Data
Chang Liu, YuLin Liao, Dongsheng Wang, Jie Yang, Liwei Zhao, Xiaoling Liu, Zuo Wang, Lichun Wu

TL;DR
This study developed a machine learning model that uses routine clinical and laboratory data to classify lung cancer subtypes non-invasively; the RandomForest model reached a test-set AUC of 0.969, and a web-based calculator was built for clinical use.
Contribution
A novel non-invasive machine learning model for lung cancer classification using routine clinical data and an accessible web-based deployment tool.
Findings
The RandomForest model achieved an AUC of 0.999 in the training set and 0.969 in the test set.
Sex and tumor markers were identified as significant predictors for lung cancer classification.
A web-based calculator was developed for real-time clinical application.
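As a reading aid for the AUC values above, the following is a minimal sketch of how an AUC can be computed from true labels and model scores. The function name `roc_auc` and the toy labels and scores are illustrative assumptions and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up labels and predicted scores (hypothetical, not study data).
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(y_true, y_score)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 means every positive case is scored above every negative case, and 0.5 corresponds to random ranking. A gap between training and test AUC, as in the figures above, is the usual reason to quote the test-set value as the estimate of real-world performance.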
Abstract
Accurate pathological classification of lung cancer is essential for informing treatment strategies. However, invasive biopsy procedures are not feasible for high-risk patients or those with inaccessible lesions. This study aimed to develop a machine learning model that uses routine clinical and laboratory data for non-invasive classification of lung cancer. Data from patients admitted to Sichuan Provincial Cancer Hospital were retrospectively analyzed. Key features were selected using the LASSO and Boruta algorithms. Four machine learning models (logistic regression, extreme gradient boosting (XGBoost), categorical boosting (CatBoost), and random forest (RandomForest)) were trained and optimized through five-fold cross-validation. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), accuracy, and F1 score. An online calculator…
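The workflow described in the abstract (feature selection, then several classifiers compared by five-fold cross-validation on AUC, accuracy, and F1) might be sketched as follows with scikit-learn. This is an illustration under stated assumptions, not the authors' code: the data are synthetic, an L1-penalised logistic regression stands in for the LASSO step, Boruta, XGBoost, and CatBoost are omitted so that scikit-learn is the only dependency, and all hyperparameters are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a table of routine clinical and laboratory features
# (assumption: the study's real data are not available here).
X, y = make_classification(
    n_samples=400, n_features=30, n_informative=6, random_state=0
)


def l1_selector():
    # LASSO-style selection: keep features whose L1-penalised logistic
    # regression coefficient is non-zero. C=0.1 is an arbitrary choice.
    return SelectFromModel(
        LogisticRegression(penalty="l1", solver="liblinear", C=0.1, random_state=0)
    )


# Two of the four model families named in the abstract.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
metric_names = ("roc_auc", "accuracy", "f1")
results = {}
for name, clf in models.items():
    # Scaling and feature selection sit inside the pipeline, so they are
    # refit on the training part of every fold.
    pipe = make_pipeline(StandardScaler(), l1_selector(), clf)
    scores = cross_validate(pipe, X, y, cv=cv, scoring=list(metric_names))
    results[name] = {m: float(np.mean(scores[f"test_{m}"])) for m in metric_names}

for name, metrics in results.items():
    print(name, {m: round(v, 3) for m, v in metrics.items()})
```

Placing the selector inside the pipeline is a design choice made for this sketch: it keeps the held-out fold from influencing which features are chosen. Whether the study performed selection inside or outside cross-validation is not stated in the abstract.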
Taxonomy
Topics: Lung Cancer Diagnosis and Treatment · Radiomics and Machine Learning in Medical Imaging · Lung Cancer Research Studies
