# Explainable Mortality Prediction for Liver Transplant Candidates with Hepatocellular Carcinoma: A Supervised Clustering Approach

**Authors:** Abdelghani Halimi, Nesma Houmani, Sonia Garcia-Salicetti, Ilias Kounis, Audrey Coilly

PMC · DOI: 10.34133/hds.0295 · Health Data Science · 2026-01-13

## TL;DR

This paper introduces a machine learning scoring system that predicts mortality risk for liver transplant candidates with hepatocellular carcinoma (HCC) and uses SHAP explanations to show how individual risk factors contribute to each prediction.

## Contribution

The approach combines ensemble learning (LightGBM) with SHAP explanations, then clusters patients on their SHAP values embedded in UMAP space (supervised clustering) to identify subgroups with distinct mortality risks.

## Key findings

- The LightGBM-based system outperforms conventional scores while using only 8 variables selected by SHAP analysis.
- Supervised clustering reveals 7 subgroups with increasing mortality risk and detailed risk factor contributions.

## Abstract

**Background:** Accurate mortality prediction for liver transplant candidates with hepatocellular carcinoma (HCC) remains a critical challenge. Traditional scoring systems, including Child–Pugh, Albumin–Bilirubin, Model for End-Stage Liver Disease (MELD), MELD-Na, MELD 3.0, and Alpha-fetoprotein scores, are widely used but often fail to provide precise risk assessments. This limitation arises from the dual burden of liver dysfunction and tumor progression, which complicates prognosis. Consequently, there is a need for a comprehensive approach that addresses both considerations to better manage HCC patients.

**Methods:** We propose an advanced machine learning-based scoring system that exploits Ensemble Learning and SHapley Additive exPlanations (SHAP) for a better understanding of key mortality risk factors. SHAP offers valuable insights into the decision-making process by providing both global and local explanations. By embedding SHAP values in the Uniform Manifold Approximation and Projection space, we perform supervised clustering to infer latent subgroups, providing finer granularity on the contribution of key variables to mortality risk assessment.

**Results:** Our system, based on LightGBM, outperforms conventional scores while leveraging only 8 relevant variables selected by SHAP analysis. These variables address the challenging dual risk problem set out in this work. With supervised clustering, we uncover 7 subgroups showing an increasing mortality risk level and a fine assessment of the contribution of each risk factor.

**Conclusion:** In contrast to existing studies, our approach offers an integrative data-driven framework for handling the dual risk challenge posed by HCC patients with liver dysfunction. It also provides a valuable tool for more precise risk evaluation that may guide treatment decisions and help monitor patient progression.
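The abstract's distinction between local and global SHAP explanations can be illustrated with a small, self-contained sketch. It computes exact Shapley values by enumerating feature coalitions for a toy risk function. Everything here is an assumption made for illustration: `toy_risk`, its coefficients, the feature names, the all-zero `baseline`, and the two example patients are hypothetical and are not the paper's model, variables, or data. The `shap` library typically estimates attributions against a background dataset rather than a single baseline point, so this is a simplification of the idea, not the authors' implementation.

```python
from itertools import combinations
from math import factorial


def toy_risk(x):
    """Hypothetical risk score with one interaction term (not the paper's model)."""
    return (0.2 * x["afp"] + 0.3 * x["bilirubin"]
            + 0.1 * x["afp"] * x["bilirubin"] + 0.05 * x["age"])


def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


# Illustrative, standardized feature values (assumed, not clinical data).
baseline = {"afp": 0.0, "bilirubin": 0.0, "age": 0.0}
patient = {"afp": 3.0, "bilirubin": 2.0, "age": 1.0}

# Local explanation: per-feature contributions for one patient.
phi = shapley_values(toy_risk, patient, baseline)

# Global explanation: mean absolute contribution over a (tiny) cohort.
cohort = [patient, {"afp": 1.0, "bilirubin": 0.0, "age": 2.0}]
per_patient = [shapley_values(toy_risk, p, baseline) for p in cohort]
global_importance = {
    k: sum(abs(values[k]) for values in per_patient) / len(cohort)
    for k in baseline
}
```

The local values in `phi` sum to `toy_risk(patient) - toy_risk(baseline)`, which is the additive property that makes per-patient attributions comparable. Vectors like `phi`, one per patient, are the kind of representation that could then be embedded and clustered, as the Methods paragraph describes.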

## Linked entities

- **Diseases:** hepatocellular carcinoma (MONDO:0007256)

## Full-text entities

- **Genes:** AFP (alpha fetoprotein) [NCBI Gene 174] {aka AFPD, FETA, HPAFP}
- **Diseases:** liver dysfunction (MESH:D017093), tumor (MESH:D009369), HCC (MESH:D006528), End-Stage Liver Disease (MESH:D058625)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12796095/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12796095/full.md

## References

47 references — full list in the complete paper: https://tomesphere.com/paper/PMC12796095/full.md

---
Source: https://tomesphere.com/paper/PMC12796095