Subgroup Discovery with the Cox Model
Zachary Izzo, Iain Melvin

TL;DR
This paper introduces new quality metrics and algorithms for subgroup discovery in survival analysis, with the goal of finding interpretable subsets of the data on which a Cox model is highly accurate. The methods come with theoretical guarantees and are validated on synthetic and real data.
Contribution
It presents the first study on subgroup discovery for Cox models, introducing new quality metrics and algorithms with theoretical guarantees and practical validation.
Findings
Effective recovery of ground-truth subgroups in synthetic data
Improved model fit over naive Cox modeling in real data
Discovery of meaningful subgroups in NASA jet engine data
Abstract
We study the problem of subgroup discovery for survival analysis, where the goal is to find an interpretable subset of the data on which a Cox model is highly accurate. Our work is the first to study this particular subgroup problem, for which we make several contributions. Subgroup discovery methods generally require a "quality function" in order to sift through and select the most advantageous subgroups. We first examine why existing natural choices for quality functions are insufficient to solve the subgroup discovery problem for the Cox model. To address the shortcomings of existing metrics, we introduce two technical innovations: the *expected prediction entropy (EPE)*, a novel metric for evaluating survival models which predict a hazard function; and the *conditional rank statistics (CRS)*, a statistical object which quantifies the deviation of an individual point to the…
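To make the setting in the abstract concrete, the following is a minimal sketch of one ingredient only: evaluating a Cox model's log partial likelihood on a subgroup defined by an axis-aligned box over the features. It is not the paper's method, and it does not implement EPE or CRS, which the truncated abstract does not define. The function names `cox_partial_log_likelihood` and `in_box`, the box-shaped subgroup, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def cox_partial_log_likelihood(risk_scores, times, events):
    """Standard Cox log partial likelihood (ties handled Breslow-style).

    risk_scores: linear predictors x_i @ beta, one per sample
    times:       observed event or censoring times
    events:      1 if the event was observed, 0 if censored
    """
    risk_scores = np.asarray(risk_scores, dtype=float)
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    total = 0.0
    for i in np.flatnonzero(events):
        # Risk set: every sample still under observation at time t_i.
        at_risk = times >= times[i]
        total += risk_scores[i] - np.log(np.sum(np.exp(risk_scores[at_risk])))
    return total

def in_box(X, lower, upper):
    """Boolean mask for rows of X inside the box [lower, upper].

    The box is a hypothetical stand-in for an interpretable subgroup.
    """
    X = np.asarray(X, dtype=float)
    return np.all((X >= lower) & (X <= upper), axis=1)

# Toy data (made up for illustration): one feature, three samples.
X = np.array([[0.1], [0.5], [0.9]])
times = np.array([1.0, 2.0, 3.0])
events = np.array([1, 1, 1])
scores = np.zeros(3)  # a null model with beta = 0

mask = in_box(X, lower=np.array([0.0]), upper=np.array([0.6]))
ll_all = cox_partial_log_likelihood(scores, times, events)
ll_sub = cox_partial_log_likelihood(scores[mask], times[mask], events[mask])
```

The abstract states that existing natural choices of quality function are insufficient for this problem, so a likelihood computed this way should be read as a baseline ingredient and not as the quality function the paper proposes.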
Taxonomy
Topics: Statistical Methods and Inference · Advanced Clustering Algorithms Research · Data Quality and Management
