Rethinking external validation for the target population: Capturing patient-level similarity with a generative model

Mohammad Azizmalayeri; Ameen Abu-Hanna; Saskia Houterman; Marije M. Vis; Giovanni Cinà (on behalf of the NHR THI registration committee)

arXiv:2605.11284 · stat.ME · May 13, 2026

TL;DR

This paper introduces a generative-model-based framework for external validation of predictive models. It scores each external patient's similarity to the development data and reports performance across similarity-defined subgroups, which makes validation results easier to interpret and informs decisions about transportability.

Contribution

The paper proposes a novel autoencoder-based similarity measure that helps distinguish model deficiencies from case-mix effects without sharing the original development data.

Findings

01. Substantial performance variation across similarity-defined subgroups.

02. Conventional validation can mask clinically relevant performance deficits.

03. Framework links model performance to population alignment, aiding transportability decisions.

Abstract

Background: External validation is essential for assessing the transportability of predictive models. However, its interpretation is often confounded by differences between the external and development populations. This study introduces a framework to distinguish model deficiencies from case-mix effects. Method: We propose a framework that quantifies each external patient's similarity to the development data and measures performance in subgroups with varying levels of alignment to the development distribution. We use generative models, specifically autoencoders, to estimate similarity, offering a more flexible alternative to traditional linear approaches and enabling validation without sharing the original development data. The utility of the autoencoder-based similarity measure is demonstrated using synthetic data, and the framework's application is illustrated using data from the…
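
The subgroup step described in the Method can be illustrated with a minimal sketch. It assumes that each external patient already has a dissimilarity score, for example the reconstruction error of an autoencoder trained on the development data; the abstract does not state which score the authors use, so this is an assumption. The function name `performance_by_similarity`, the equal-sized grouping, and the Brier score as the performance metric are all hypothetical choices made for illustration, not the paper's implementation.

```python
def performance_by_similarity(dissimilarity, y_true, y_prob, n_groups=4):
    """Report the Brier score in equal-sized subgroups of external patients.

    dissimilarity: per-patient score, lower means closer to the development
                   data (assumed here: autoencoder reconstruction error).
    y_true:        observed binary outcomes (0 or 1).
    y_prob:        predicted probabilities from the model under validation.
    Returns one dict per subgroup, most similar subgroup first.
    """
    n = len(dissimilarity)
    if not (len(y_true) == len(y_prob) == n):
        raise ValueError("all inputs must have the same length")
    if not 1 <= n_groups <= n:
        raise ValueError("n_groups must be between 1 and the number of patients")

    # Rank patients from most to least similar to the development data.
    order = sorted(range(n), key=lambda i: dissimilarity[i])

    report = []
    for g in range(n_groups):
        # Contiguous slice of the ranking; sizes differ by at most one.
        idx = order[g * n // n_groups:(g + 1) * n // n_groups]
        brier = sum((y_prob[i] - y_true[i]) ** 2 for i in idx) / len(idx)
        report.append({
            "n": len(idx),
            "max_dissimilarity": max(dissimilarity[i] for i in idx),
            "brier": brier,
        })
    return report


# Toy external cohort (made-up numbers): the last four patients are far from
# the development data and the model is poorly calibrated for them.
dissimilarity = [0.1, 0.2, 0.3, 0.4, 2.0, 2.5, 3.0, 3.5]
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_prob = [0.9, 0.1, 0.8, 0.2, 0.4, 0.6, 0.5, 0.5]

report = performance_by_similarity(dissimilarity, y_true, y_prob, n_groups=2)
overall_brier = sum((p - y) ** 2 for p, y in zip(y_prob, y_true)) / len(y_true)
```

In this toy cohort the pooled Brier score sits between the two subgroup scores, which is the kind of masking a single overall estimate can produce: the model looks adequate on average while performing well on patients resembling the development data and poorly on the rest.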
