# Machine learning models for predicting hepatocellular carcinoma development in patients with chronic viral hepatitis B infection

**Authors:** Warissara Kuaaroon, Thodsawit Tiyarattanachai, Terapap Apiparakoon, Sanparith Marukatat, Natthaporn Tanpowpong, Sombat Treeprasertsuk, Rungsun Rerknimitr, Pisit Tangkijvanich, Prooksa Ananchuensook, Watcharasak Chotiyaputta, Kittichai Samaithongcharoen, Roongruedee Chaiteerakij

PMC · DOI: 10.2478/abm-2025-0007 · Asian Biomedicine: Research, Reviews and News · 2025-02-28

## TL;DR

This paper develops machine learning models to predict the risk of liver cancer in patients with chronic hepatitis B, aiming to improve surveillance strategies.

## Contribution

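The study develops four machine learning model variants (all or selected features, each with or without alpha fetoprotein) that use data from follow-up visits to predict a diagnosis of liver cancer within 6 months in patients with chronic hepatitis B.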

## Key findings

- In the derivation cohort, models including alpha fetoprotein (AFP) showed higher sensitivity and specificity for predicting hepatocellular carcinoma (HCC) than the corresponding models without AFP.
- In the independent cohort, the selected features model with AFP (SFA) had lower sensitivity (0.634 vs. 0.690) and similar specificity (0.657 vs. 0.651) compared with the selected features model without AFP (SFN).

## Abstract

Chronic hepatitis B (CHB) infection is the major risk factor for hepatocellular carcinoma (HCC).

This study aimed to develop machine learning models for predicting the individual risk of HCC development in CHB-infected patients.

Machine learning models were constructed using features from follow-up visits of CHB patients to predict a diagnosis of HCC within 6 months after each index follow-up. We developed 4 model variants: all features with alpha fetoprotein (AFP) (AFA), all features without AFP (AFN), selected features with AFP (SFA), and selected features without AFP (SFN). Performance was evaluated using 10-fold cross-validation on a derivation cohort and further validated on an independent cohort.
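The per-visit prediction target described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function name `label_visit`, the 183-day approximation of "6 months", and the choice to exclude visits on or after the diagnosis date are all assumptions made for illustration; the summary does not state how these details were handled.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: "6 months" is approximated as 183 days.
HORIZON = timedelta(days=183)


def label_visit(visit: date, hcc_diagnosis: Optional[date]) -> Optional[int]:
    """Label one follow-up visit (hypothetical helper).

    Returns 1 if HCC is diagnosed within HORIZON after the visit,
    0 if it is not, and None for visits on or after the diagnosis
    date (excluded here by assumption).
    """
    if hcc_diagnosis is None:
        return 0  # patient never diagnosed with HCC during follow-up
    if visit >= hcc_diagnosis:
        return None  # visit is not a valid index visit in this sketch
    return 1 if hcc_diagnosis - visit <= HORIZON else 0


# Hypothetical patient: three visits, HCC diagnosed on 2021-03-01.
visits = [date(2020, 1, 10), date(2020, 7, 15), date(2021, 1, 20)]
diagnosis = date(2021, 3, 1)
labels = [label_visit(v, diagnosis) for v in visits]
print(labels)
```

Only the last visit falls within the horizon, so it is the only positive example for this patient. The sketch ignores censoring: a visit with less than 6 months of subsequent follow-up and no diagnosis is labeled 0 here, which a real pipeline would likely need to handle separately.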

In the derivation cohort of 2,382 patients, of whom 117 developed HCC, AFA achieved higher sensitivity (0.634, 95% confidence interval [CI]: 0.559–0.708) and specificity (0.836; 0.830–0.842) than AFN (sensitivity 0.553; 0.476–0.630 and specificity 0.786; 0.779–0.792). SFA also achieved higher sensitivity (0.683; 0.611–0.755 vs. 0.658; 0.585–0.732) and specificity (0.756; 0.749–0.763 vs. 0.744; 0.737–0.751) than SFN. Performance of SFA and SFN was tested in another cohort of 162 patients in which 57 patients developed HCC. SFA achieved sensitivity and specificity of 0.634 (0.522–0.746) and 0.657 (0.615–0.699), while sensitivity and specificity of SFN were 0.690 (0.583–0.798) and 0.651 (0.609–0.693), respectively.
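Sensitivity and specificity with 95% confidence intervals, as reported above, can be computed from confusion-matrix counts. The following Python sketch uses the normal-approximation (Wald) interval, which is an assumption; the summary does not state which interval method the authors used. The function names and the counts in the example are hypothetical and are not taken from the paper.

```python
import math


def proportion_ci(successes: int, total: int, z: float = 1.96):
    """Return (estimate, lower, upper) using a Wald interval (assumed method)."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clip to [0, 1] because the Wald interval can overshoot the valid range.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return {
        "sensitivity": proportion_ci(tp, tp + fn),
        "specificity": proportion_ci(tn, tn + fp),
    }


# Hypothetical counts for illustration only.
result = sensitivity_specificity(tp=60, fn=40, tn=800, fp=200)
for name, (est, lo, hi) in result.items():
    print(f"{name}: {est:.3f} ({lo:.3f}-{hi:.3f})")
```

The interval for sensitivity is wider than the interval for specificity because it rests on far fewer positive cases, the same pattern seen in the reported results.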

The machine learning models demonstrate good performance for predicting the short-term risk of HCC development and may potentially be used for tailoring surveillance intervals for CHB patients.

## Linked entities

- **Diseases:** hepatocellular carcinoma (MONDO:0007256), chronic hepatitis B (MONDO:0005344)

## Full-text entities

- **Genes:** AFP (alpha fetoprotein) [NCBI Gene 174] {aka AFPD, FETA, HPAFP}
- **Diseases:** HCC (MESH:D006528), chronic viral hepatitis B infection (MESH:D014777), infected (MESH:D007239), CHB (MESH:D019694)
- **Chemicals:** SFN (MESH:D000077157)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11994220/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC11994220/full.md

## References

15 references — full list in the complete paper: https://tomesphere.com/paper/PMC11994220/full.md

---
Source: https://tomesphere.com/paper/PMC11994220