A General Mixture Loss Function to Optimize a Personalized Predictive Model

Tatiana Krikella; Joel A. Dubin

arXiv:2601.20788 · stat.ME · January 30, 2026

TL;DR

This paper introduces a flexible, generalized loss function for personalized prediction models that optimizes subpopulation size to improve model discrimination and calibration in precision medicine.

Contribution

The paper proposes a novel loss function that jointly optimizes discrimination and calibration, along with practical guidelines for selecting the subpopulation size in personalized prediction models (PPMs).

Findings

1. The optimal subpopulation size typically falls between 20% and 70% of the training data, which the authors recommend as bounds for the search grid.

2. The choice of performance measure affects the optimal subpopulation size.

3. The method improves model performance in simulated and real datasets.

Abstract

Advances in precision medicine increasingly drive methodological innovation in health research. A key development is the use of personalized prediction models (PPMs), which are fit using a similar subpopulation tailored to a specific index patient, and have been shown to outperform one-size-fits-all models, particularly in terms of model discrimination performance. We propose a generalized loss function that enables tuning of the subpopulation size used to fit a PPM. This loss function allows joint optimization of discrimination and calibration, allowing both the performance measures and their relative weights to be specified by the user. To reduce computational burden, we conducted extensive simulation studies to identify practical bounds for the grid of subpopulation sizes. Based on these results, we recommend using a lower bound of 20% and an upper bound of 70% of the entire…
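
The abstract describes a loss that mixes a discrimination measure and a calibration measure with user-chosen weights, evaluated over a grid of subpopulation sizes. The sketch below illustrates that general idea only. The exact form of the authors' loss is not given on this page, so the convex combination, the choice of 1 - AUC and the Brier score as the two measures, and the names `auc`, `brier`, `mixture_loss`, `size_grid`, and `alpha` are all assumptions made for illustration.

```python
import numpy as np


def auc(y, p):
    """Discrimination: share of (case, control) pairs ranked correctly."""
    y, p = np.asarray(y), np.asarray(p, dtype=float)
    pos, neg = p[y == 1], p[y == 0]
    diff = pos[:, None] - neg[None, :]
    # Ties count as half a concordant pair.
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))


def brier(y, p):
    """Mean squared error of predicted probabilities (assumed stand-in
    for a calibration-oriented measure; not confirmed by the source)."""
    y, p = np.asarray(y, dtype=float), np.asarray(p, dtype=float)
    return float(np.mean((p - y) ** 2))


def mixture_loss(y, p, alpha=0.5):
    """Hypothetical mixture: alpha weights discrimination, 1 - alpha calibration.
    Both terms are oriented so that lower is better."""
    return alpha * (1.0 - auc(y, p)) + (1.0 - alpha) * brier(y, p)


def size_grid(n_train, lower=0.20, upper=0.70, steps=6):
    """Candidate subpopulation sizes between the recommended 20% and 70% bounds.
    The number of grid points (steps) is an arbitrary illustrative choice."""
    props = np.linspace(lower, upper, steps)
    return [int(m) for m in np.rint(props * n_train)]
```

Under these assumptions, tuning would amount to fitting a PPM at each size returned by `size_grid`, computing `mixture_loss` on held-out predictions, and keeping the size with the smallest loss. Changing `alpha` or swapping either measure would be expected to shift the selected size, in line with the finding that the choice of performance measure matters.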

Taxonomy

Topics: Machine Learning in Healthcare · Statistical Methods and Inference · AI in cancer detection