Pre-operative T-stage discrimination in gallbladder cancer using machine learning and DeepSeek-R1
Joongwon Chae, Zhenyu Wang, Duanpo Wu, Lian Zhang, Alexander Tuzikov, Magrupov Talat Madiyevich, Min Xu, Dongmei Yu, Peiwu Qin

TL;DR
This study found that blood biomarkers and machine learning models poorly distinguish early stages of gallbladder cancer, while a large language model using radiology reports achieved high accuracy.
Contribution
Demonstrated the superior performance of a large language model over biomarker-based machine learning for T-stage discrimination in gallbladder cancer.
Findings
Blood biomarker-based machine learning models showed poor T-stage discrimination, with AUROC near random chance.
DeepSeek-R1 achieved 89.6% accuracy using radiology reports alone, with no improvement from adding biomarker data.
SMOTE improved cross-validation accuracy but did not enhance test set performance for biomarker models.
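To make "AUROC near random chance" concrete, here is a minimal sketch of AUROC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The scores are hypothetical illustration values, not data from the study, and the function name `auroc` is an assumption introduced for this example.

```python
from itertools import product


def auroc(scores_pos, scores_neg):
    """Pairwise AUROC: P(positive score > negative score), ties count 0.5."""
    wins = 0.0
    for p, n in product(scores_pos, scores_neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical scores (not study data): perfectly separated classes.
separating = auroc([0.9, 0.8, 0.7], [0.3, 0.2, 0.1])

# Hypothetical scores with identical distributions in both classes,
# i.e. a model whose output carries no information about the class.
uninformative = auroc([0.2, 0.8, 0.5, 0.5], [0.5, 0.5, 0.8, 0.2])

print(separating, uninformative)
```

An AUROC close to 0.5, as in the second case, means the score ranks the two classes no better than a coin flip, which is the sense in which the biomarker models are described as near chance.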
Abstract
Gallbladder cancer (GBC) frequently exhibits non-specific early symptoms, delaying diagnosis. This study (i) assessed whether routine blood biomarkers can distinguish early T stages via machine learning and (ii) compared the T-stage discrimination performance of a large language model (DeepSeek-R1) when supplied with (a) radiology-report text alone versus (b) radiology-report text plus blood-biomarker values. We retrospectively analyzed 232 pathologically confirmed GBC patients treated at Lishui Central Hospital between 2023 and 2024 (T1, n = 51; T2, n = 181). Seven blood variables, namely neutrophil-to-lymphocyte ratio (NLR), monocyte-to-lymphocyte ratio (MLR), platelet-to-lymphocyte ratio (PLR), carcinoembryonic antigen (CEA), carbohydrate antigen 19-9 (CA19-9), carbohydrate antigen 125 (CA125), and alpha-fetoprotein (AFP), were used to train Random Forest, Support Vector Machine (SVC),…
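As a rough sketch of two quantities implied by the abstract: the three inflammatory ratios are each a cell count divided by the lymphocyte count, and the class imbalance (51 T1 versus 181 T2) sets an accuracy floor that any model must beat. The function name, argument names, and blood-count values below are hypothetical; only the cohort sizes come from the abstract. The baseline is computed on the full cohort, which may differ from the study's own test split.

```python
def inflammatory_ratios(neutrophils, monocytes, platelets, lymphocytes):
    """Return NLR, MLR and PLR from absolute counts in the same unit."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return {
        "NLR": neutrophils / lymphocytes,
        "MLR": monocytes / lymphocytes,
        "PLR": platelets / lymphocytes,
    }


# Hypothetical counts in 10^9 cells/L (illustration only, not patient data).
ratios = inflammatory_ratios(
    neutrophils=4.0, monocytes=0.5, platelets=250.0, lymphocytes=2.0
)

# Cohort sizes reported in the abstract.
n_t1, n_t2 = 51, 181

# Accuracy of a trivial classifier that always predicts the majority class.
majority_baseline = max(n_t1, n_t2) / (n_t1 + n_t2)

print(ratios)
print(round(majority_baseline, 3))
```

Under these assumptions, always predicting T2 would be correct for about 78% of the cohort, so reported accuracies are best read against that floor rather than against 50%.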
Taxonomy
Topics: Cholangiocarcinoma and Gallbladder Cancer Studies · Radiomics and Machine Learning in Medical Imaging · Pancreatic and Hepatic Oncology Research
