# Traditional Cox regression outperforms large language models in predicting long-term progression of intermediate to advanced hepatocellular carcinoma

**Authors:** Kang Li, Chen Wang, Yiqi Xiong, Yi Song, Yubin Zhang, Danlei Mou, Caixia Hu, Dandan Guo, Tingting Mei, Ang Li, Yonghong Zhang

PMC · DOI: 10.3389/fonc.2026.1710529 · Frontiers in Oncology · 2026-01-29

## TL;DR

A traditional Cox regression model outperformed large language models (DeepSeek R1, DeepSeek V3, and Qwen/QWQ-32B) in predicting 12-, 24-, and 36-month progression risk in intermediate to advanced hepatocellular carcinoma.

## Contribution

Demonstrated that traditional Cox regression models outperform LLMs in long-term hepatocellular carcinoma progression prediction.

## Key findings

- The Cox regression model showed better predictive accuracy than the LLMs for 12-, 24-, and 36-month progression risk.
- Adding ablation and/or immune checkpoint inhibitors to transarterial chemoembolization plus targeted therapy prolonged progression-free survival.
- The LLMs underperformed the Cox model at every time point, except for DeepSeek R1 at 12 and 24 months in the training cohort.

## Abstract

This study aimed to evaluate and compare the performance of large language models (LLMs) and traditional Cox regression models in predicting the long-term progression risk in patients with intermediate to advanced hepatocellular carcinoma (HCC).

A total of 576 patients with intermediate to advanced HCC were included, comprising a training cohort (n = 403) and a validation cohort (n = 173) for model development and validation. We evaluated the predictive performance of LLMs (DeepSeek R1, DeepSeek V3, and Qwen/QWQ-32B) and the traditional Cox regression model for estimating the progression risk of HCC at 12, 24, and 36 months. Time-dependent area under the curve (AUC), decision curve analysis, calibration curve, net reclassification improvement, and integrated discrimination improvement were used to comprehensively assess model performance.
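To make the time-dependent AUC concrete, here is a minimal Python sketch of a cumulative/dynamic AUC at a fixed horizon. It is not the authors' implementation: the function name, the toy data, and the choice to simply drop patients censored before the horizon (rather than apply inverse-probability-of-censoring weights, as standard estimators do) are all simplifying assumptions for illustration.

```python
def cumulative_dynamic_auc(times, events, scores, horizon):
    """Naive cumulative/dynamic AUC at `horizon` (same unit as `times`).

    Cases:    progression observed at or before the horizon.
    Controls: still progression-free after the horizon.
    Patients censored before the horizon are dropped (assumption: no
    censoring weights, unlike standard estimators).
    """
    rows = list(zip(times, events, scores))
    cases = [s for t, e, s in rows if e == 1 and t <= horizon]
    controls = [s for t, e, s in rows if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0   # higher predicted risk for the case
            elif case_score == control_score:
                concordant += 0.5   # ties count as half
    return concordant / (len(cases) * len(controls))


# Hypothetical toy cohort: follow-up in months, 1 = progression observed,
# and a predicted risk score from any model (Cox or LLM).
toy_times = [5, 8, 14, 20, 30, 40]
toy_events = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.7, 0.3, 0.2]

auc_12 = cumulative_dynamic_auc(toy_times, toy_events, toy_scores, 12)
auc_24 = cumulative_dynamic_auc(toy_times, toy_events, toy_scores, 24)
```

Because the metric depends only on how a model ranks patients at each horizon, the same function can score a Cox linear predictor and an LLM-generated risk estimate on equal terms.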

Based on transarterial chemoembolization combined with targeted therapy, the addition of immune checkpoint inhibitors (ICIs) and/or ablation prolonged the progression-free survival (PFS): the combination of all four treatments showed the best outcome (median PFS = 12.3 months, 95%CI = 9.9–14.1). Univariate and multivariate Cox analyses identified independent prognostic factors, which were utilized to develop a progression risk nomogram. The model had good discrimination, with training cohort AUCs (at 12, 24, and 36 months) of 0.72 (95%CI = 0.67–0.78), 0.77 (95%CI = 0.69–0.86), and 0.96 (95%CI = 0.93–0.99), respectively, and validation cohort AUCs of 0.75 (95%CI = 0.67–0.83), 0.81 (95%CI = 0.71–0.91), and 0.97 (95%CI = 0.94–1.00), respectively. Three LLMs were evaluated on the same dataset. Except for DeepSeek R1 at 12 and 24 months (training cohort), all LLMs underperformed the Cox model across time points, indicating current limitations in predicting long-term progression risk.
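A Cox-based nomogram typically turns a patient's covariates into a risk at a given horizon through the relation risk(t) = 1 − S0(t)^exp(β·x), where S0(t) is the baseline progression-free probability at time t. The sketch below illustrates that relation only. The covariate names, coefficients, and baseline value are hypothetical placeholders, since this summary does not list the prognostic factors or fitted values from the paper.

```python
import math


def progression_risk(baseline_pfs, coefficients, covariates):
    """Risk of progression by a horizon under a Cox model.

    baseline_pfs: S0(t), baseline progression-free probability at the horizon.
    coefficients: log hazard ratios (beta) keyed by covariate name.
    covariates:   the patient's covariate values, same keys.
    """
    linear_predictor = sum(
        coefficients[name] * covariates[name] for name in coefficients
    )
    return 1.0 - baseline_pfs ** math.exp(linear_predictor)


# Hypothetical values for illustration only (not taken from the paper).
betas = {"multiple_tumors": 0.5, "high_afp": 0.3}
s0_at_12_months = 0.6

reference_patient = {"multiple_tumors": 0, "high_afp": 0}
high_risk_patient = {"multiple_tumors": 1, "high_afp": 1}

risk_reference = progression_risk(s0_at_12_months, betas, reference_patient)
risk_high = progression_risk(s0_at_12_months, betas, high_risk_patient)
```

With all covariates at their reference level the linear predictor is zero, so the risk reduces to 1 − S0(t). Any positive linear predictor raises the risk above that baseline.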

The combination of ablation and/or ICIs with standard treatment could prolong PFS. In predicting the long-term HCC progression risk, the traditional Cox model outperformed the LLMs. Combining the two approaches may merge the stability of structured modeling with the multi-source data processing capacity of LLMs, potentially improving prediction accuracy.

## Linked entities

- **Diseases:** hepatocellular carcinoma (MONDO:0007256)

## Full-text entities

- **Diseases:** HCC (MESH:D006528)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12893998/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12893998/full.md

## References

38 references — full list in the complete paper: https://tomesphere.com/paper/PMC12893998/full.md

---
Source: https://tomesphere.com/paper/PMC12893998