A General Mixture Loss Function to Optimize a Personalized Predictive Model
Tatiana Krikella, Joel A. Dubin

TL;DR
This paper introduces a flexible, generalized loss function for personalized prediction models that optimizes subpopulation size to improve model discrimination and calibration in precision medicine.
Contribution
It proposes a novel loss function allowing joint optimization of discrimination and calibration, with practical guidelines for subpopulation size selection in PPMs.
Findings
Simulations support searching over subpopulation sizes between 20% and 70% of the training data.
Choice of performance measure affects the optimal subpopulation size.
The method improves model performance in simulated and real datasets.
Abstract
Advances in precision medicine increasingly drive methodological innovation in health research. A key development is the use of personalized prediction models (PPMs), which are fit using a similar subpopulation tailored to a specific index patient, and have been shown to outperform one-size-fits-all models, particularly in terms of model discrimination performance. We propose a generalized loss function that enables tuning of the subpopulation size used to fit a PPM. This loss function allows joint optimization of discrimination and calibration, with both the performance measures and their relative weights specified by the user. To reduce computational burden, we conducted extensive simulation studies to identify practical bounds for the grid of subpopulation sizes. Based on these results, we recommend using a lower bound of 20% and an upper bound of 70% of the entire…
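The abstract does not state the loss in closed form, so the following is only a minimal sketch of the idea. It assumes one discrimination measure (AUC) and one calibration-sensitive measure (Brier score), combined with a user-chosen weight `alpha`. The function names, the convex-combination form, the grid spacing, and the `predict_for` callback (a hypothetical stand-in for fitting PPMs at a given subpopulation proportion and returning predicted probabilities) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def auc(y, p):
    """Chance that a random positive scores above a random negative (ties count 1/2).
    Assumes y contains both classes."""
    y = np.asarray(y)
    p = np.asarray(p, dtype=float)
    pos, neg = p[y == 1], p[y == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def brier(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return float(np.mean((np.asarray(p, dtype=float) - np.asarray(y)) ** 2))

def mixture_loss(y, p, alpha=0.5):
    """Assumed form: weighted mix of a discrimination term and a calibration term.
    alpha = 1 scores discrimination only; alpha = 0 scores the Brier term only."""
    return alpha * (1.0 - auc(y, p)) + (1.0 - alpha) * brier(y, p)

def select_proportion(y, predict_for, grid=None, alpha=0.5):
    """Evaluate the mixture loss at each candidate proportion and return the minimizer.
    predict_for(q) is a hypothetical callback returning predictions for proportion q."""
    if grid is None:
        # Grid bounded by the 20% and 70% limits recommended in the abstract;
        # the 5-point spacing is an arbitrary choice for illustration.
        grid = np.round(np.linspace(0.20, 0.70, 11), 2)
    losses = {float(q): mixture_loss(y, predict_for(float(q)), alpha) for q in grid}
    best = min(losses, key=losses.get)
    return best, losses
```

In use, `predict_for` would wrap whatever similarity metric and model are used to build each index patient's subpopulation. Other performance measures can be swapped in for `auc` or `brier` as long as each term is oriented so that smaller is better.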
Taxonomy
Topics: Machine Learning in Healthcare · Statistical Methods and Inference · AI in cancer detection
