A blood test-based machine learning model for predicting lung cancer risk
Lihi Schwartz, Naor Matania, Matanel Levi, Teddy Lazebnik, Shiri Kushnir, Noga Yosef, Assaf Hoogi, Dekel Shlomi

TL;DR
A machine learning model using blood tests and demographic factors can predict lung cancer risk with moderate accuracy, especially for women and non-smokers.
Contribution
A novel machine learning model that integrates blood test data and sociodemographic factors to predict lung cancer risk.
Findings
The ML model predicted lung cancer with 71.2% accuracy, 63% sensitivity, and 67.2% positive predictive value.
Age was the most significant contributor to the model, followed by red blood cell distribution and creatinine.
Women and never smokers showed higher prediction accuracy compared to men and smokers.
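The three reported metrics are all ratios over the same confusion matrix, so a small worked example may help in reading them. The sketch below is illustrative only: the `ConfusionCounts` type, the function names, and the toy counts are assumptions made here and are not taken from the paper, whose underlying counts are not given in this summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of binary predictions (hypothetical container, not from the paper)."""
    tp: int  # cases the model flagged that were later diagnosed
    fp: int  # controls the model flagged
    tn: int  # controls the model did not flag
    fn: int  # cases the model missed


def accuracy(c: ConfusionCounts) -> float:
    # Share of all predictions that were correct.
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def sensitivity(c: ConfusionCounts) -> float:
    # Share of true cases that the model flagged (recall).
    return c.tp / (c.tp + c.fn)


def positive_predictive_value(c: ConfusionCounts) -> float:
    # Share of flagged individuals who were true cases (precision).
    return c.tp / (c.tp + c.fp)


# Toy counts chosen for easy arithmetic; they are NOT the study's data.
toy = ConfusionCounts(tp=6, fp=3, tn=7, fn=4)
print(accuracy(toy), sensitivity(toy), positive_predictive_value(toy))
```

Note that positive predictive value depends on how common cases are in the evaluated sample, so a value measured on a case-control dataset would not carry over unchanged to a screening population with a different prevalence.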
Abstract
The goal of early detection is individual cancer prediction. For lung cancer (LC), age and smoking history are the primary criteria for annual low-dose CT screening, leaving other populations at risk of being overlooked. Machine learning (ML) is a promising method to identify complex patterns in the data that can reveal personalized disease predictors. An ML-based model was applied to blood test data collected before the diagnosis of LC, together with sociodemographic factors such as age and gender, among LC patients and controls to predict the risk of a future LC diagnosis. In addition to age and gender, we identified 22 blood tests that contributed to the model. For the entire study population, the ML model predicted LC with an accuracy of 71.2%, a sensitivity of 63%, and a positive predictive value of 67.2%. Higher accuracy was found among women than men (71.8% vs. 70.8%) and…
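The subgroup comparison in the abstract amounts to computing accuracy separately within each group. A minimal sketch of that calculation follows, assuming each evaluated individual is represented as a `(group, true_label, predicted_label)` tuple; the function name, the record layout, and the toy records are hypothetical and do not come from the study.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} for (group, true_label, predicted_label) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {group: correct[group] / total[group] for group in total}


# Made-up records for illustration only (1 = LC diagnosis, 0 = control).
toy_records = [
    ("women", 1, 1), ("women", 0, 0), ("women", 1, 0), ("women", 0, 0),
    ("men", 1, 1), ("men", 0, 1), ("men", 1, 0), ("men", 0, 0),
]
result = accuracy_by_group(toy_records)
print(result)
```

With small subgroups, differences of about one percentage point can fall within sampling noise, so a comparison like this is usually read alongside confidence intervals.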
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Lung Cancer Diagnosis and Treatment · AI in cancer detection
