Good Rankings, Wrong Probabilities: A Calibration Audit of Multimodal Cancer Survival Models

Sajad Ghawami

arXiv:2604.04239 · cs.LG · April 7, 2026

TL;DR

This paper systematically audits the calibration of multimodal cancer survival models, revealing widespread miscalibration despite strong discriminative performance, and evaluates methods to improve calibration.

Contribution

To the author's knowledge, the first systematic fold-level 1-calibration audit of multimodal WSI-genomics survival models, paired with an evaluation of post-hoc calibration methods.

Findings

01. All tested models fail 1-calibration on most folds.

02. Gating-based fusion improves calibration over other methods.

03. Platt scaling reduces miscalibration without harming discrimination.
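
The third finding has a simple structural explanation: Platt scaling fits a sigmoid p = sigmoid(a * s + b) to the model's scores, and a sigmoid with a > 0 is strictly increasing, so the ordering of patients (and hence any rank-based metric such as the concordance index) is unchanged. The sketch below illustrates this on toy data. It is not the paper's implementation: the binary labels (event by a fixed horizon), the toy scores, and the helper names `fit_platt`, `apply_platt`, and `rank_order` are all assumptions made for illustration, and censoring is ignored.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit p = sigmoid(a * s + b) by gradient descent on the mean log-loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log-loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def apply_platt(scores, a, b):
    """Map raw scores to recalibrated probabilities."""
    return [sigmoid(a * s + b) for s in scores]

def rank_order(values):
    """Indices that sort `values` in ascending order."""
    return sorted(range(len(values)), key=lambda i: values[i])

# Hypothetical toy data: raw scores (logits) and 0/1 outcomes at a fixed horizon.
scores = [-3.0, -2.0, -1.0, -0.5, 0.5, 1.0, 2.0, 3.0]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

a, b = fit_platt(scores, labels)
calibrated = apply_platt(scores, a, b)

# The fitted slope is positive here, so the patient ordering is preserved.
print(a > 0, rank_order(calibrated) == rank_order(scores))
```

Because only the two parameters a and b are learned, the recalibration can shift and rescale the probabilities but cannot reorder them as long as the fitted slope stays positive.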

Abstract

Multimodal deep learning models that fuse whole-slide histopathology images with genomic data have achieved strong discriminative performance for cancer survival prediction, as measured by the concordance index. Yet whether the survival probabilities derived from these models, either directly from native outputs or via standard post-hoc reconstruction, are calibrated remains largely unexamined. We conduct, to our knowledge, the first systematic fold-level 1-calibration audit of multimodal WSI-genomics survival architectures, evaluating native discrete-time survival outputs (Experiment A: 3 models on TCGA-BRCA) and Breslow-reconstructed survival curves from scalar risk scores (Experiment B: 11 architectures across 5 TCGA cancer types). In Experiment A, all three models fail 1-calibration on a majority of folds (12 of 15 fold-level tests reject after Benjamini-Hochberg correction).…
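
For readers unfamiliar with the Breslow reconstruction mentioned for Experiment B, the following is a minimal sketch of the textbook estimator, which turns scalar risk scores into survival curves via a cumulative baseline hazard. The exact procedure used in the paper is not shown in this excerpt, so the function names (`breslow_baseline`, `survival_prob`), the toy data, and the tie handling are assumptions for illustration only.

```python
import math

def breslow_baseline(times, events, risks):
    """Breslow estimate of the cumulative baseline hazard.

    times:  observed follow-up times
    events: 1 if the event was observed, 0 if censored
    risks:  scalar log-risk scores from the model
    Returns the sorted distinct event times and the cumulative hazard at each.
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    cum_hazard = []
    h = 0.0
    for t in event_times:
        # Number of events observed exactly at time t.
        d = sum(1 for ti, ei in zip(times, events) if ei == 1 and ti == t)
        # Sum of exp(risk) over everyone still at risk at time t.
        at_risk = sum(math.exp(r) for ti, r in zip(times, risks) if ti >= t)
        h += d / at_risk
        cum_hazard.append(h)
    return event_times, cum_hazard

def survival_prob(t, risk, event_times, cum_hazard):
    """S(t | risk) = exp(-H0(t) * exp(risk)), with H0 a step function."""
    h0 = 0.0
    for et, h in zip(event_times, cum_hazard):
        if et <= t:
            h0 = h
        else:
            break
    return math.exp(-h0 * math.exp(risk))

# Hypothetical toy cohort: four patients, one censored at time 2.
toy_times = [1, 2, 3, 4]
toy_events = [1, 0, 1, 1]
toy_risks = [0.0, 0.0, 0.0, 0.0]

ets, ch = breslow_baseline(toy_times, toy_events, toy_risks)
print(ets, ch, survival_prob(3, 0.0, ets, ch))
```

The reconstructed probabilities depend on the baseline hazard as well as on the risk scores, so a model can rank patients well and still yield survival probabilities that fail a calibration test.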
