Identifying treatment response subgroups in observational time-to-event data
Vincent Jeanselme, Chang Ho Yoon, Fabian Falck, Brian Tom, Jessica Barrett

TL;DR
This paper presents a new outcome-guided subgroup analysis method for identifying patient groups with different treatment responses in both RCTs and observational studies, addressing biases and improving clinical insights.
Contribution
It introduces a novel subgroup analysis strategy that works across study types, bridging individualised and average treatment effect estimation for better clinical decision-making.
Findings
Outperforms current state-of-the-art methods in experiments
Effective in both RCTs and observational studies
Uncovers clinically relevant subgroups with distinct responses
Abstract
Identifying patient subgroups with different treatment responses is an important task to inform medical recommendations, guidelines, and the design of future clinical trials. Existing approaches for treatment effect estimation primarily rely on Randomised Controlled Trials (RCTs), which tend to feature more homogeneous patient groups, making them less relevant for uncovering subgroups in the population encountered in real-world clinical practice. Subgroup analyses established for RCTs suffer from significant statistical biases when applied to observational studies, which benefit from larger and more representative populations. Our work introduces a novel, outcome-guided, subgroup analysis strategy for identifying subgroups of treatment response in both RCTs and observational studies alike. It hence positions itself in-between individualised and average treatment effect estimation to…
Peer Reviews
Decision: ICLR 2025 Conference Withdrawn Submission
Heterogeneity of treatment effects is an important problem. Authors study this problem in the context of observational studies and under censoring. The work is well motivated and positioned in the relevant literature. Although not very novel component-wise, the end-to-end model architecture may be valuable for future work.
My main criticism of the work is that it is too assumption-heavy. The authors assume that the observational study is perfect for causal analysis (e.g., no unmeasured confounding) and that the censoring time is conditionally independent of the time-to-event variable. On top of that, the assumption that the number of subgroups is known a priori makes the setting too simple and unrealistic. While this may not necessarily be a reason for criticism, the results derived in this work follow rather straightforward
- The paper addresses an interesting topic in causal inference of clinical relevance, going beyond classical CATE estimation to handle subgroup identification and time-to-event data from RWD / observational data.
- The combination of the building blocks is comprehensibly explained and mostly well-motivated.
- The authors provide code for reproducibility.
- The structure of the paper could be clearly improved to clarify the exact research gap, the comparison to existing works, and the contribution:
  - In the introduction (lines 44-52), method section (l. 201-255), and related work (l. 472-479), this paper references mostly CATE methods in the static setting, neglecting to explain the connection to CATE methods on survival data (e.g., SurvITE, etc.) or to subgroup identification for CATE, which makes it hard to judge the novelty of the work.
  - Th
1. Unlike traditional methods that often assume linear treatment responses, the proposed model employs neural networks to allow for non-linear survival functions, making it more adaptable to real-world applications.
2. The integration of monotonic neural networks with IPW is both technically advanced and practically relevant for observational study analysis. This approach helps correct non-random assignment bias without requiring parameterization of the underlying survival distributions.
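The IPW correction this review refers to can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the paper's monotonic neural networks for a plain weighted Kaplan-Meier estimator, and the function names `ipw_weights` and `weighted_kaplan_meier`, the `clip` parameter, and the assumption that propensity scores are already estimated are all hypothetical choices made for illustration.

```python
import numpy as np


def ipw_weights(treatment, propensity, clip=1e-3):
    """Inverse propensity weights: 1/e(x) if treated, 1/(1 - e(x)) if control.

    `propensity` is assumed to hold pre-estimated P(treated | covariates);
    clipping (a hypothetical safeguard) avoids division by values near 0 or 1.
    """
    treatment = np.asarray(treatment, dtype=float)
    e = np.clip(np.asarray(propensity, dtype=float), clip, 1.0 - clip)
    return treatment / e + (1.0 - treatment) / (1.0 - e)


def weighted_kaplan_meier(times, events, weights):
    """Weighted Kaplan-Meier survival estimate at each distinct event time.

    `events` is 1 for an observed event and 0 for a censored observation.
    With unit weights this reduces to the standard Kaplan-Meier estimator.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    weights = np.asarray(weights, dtype=float)

    event_times = np.unique(times[events])  # sorted distinct event times
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = weights[times >= t].sum()             # weighted risk set
        failed = weights[(times == t) & events].sum()   # weighted events at t
        s *= 1.0 - failed / at_risk
        survival.append(s)
    return event_times, np.array(survival)
```

In use, one would compute the weights once for the whole cohort and then call `weighted_kaplan_meier` separately on the treated and control rows; the gap between the two reweighted curves gives a confounding-adjusted comparison, provided the propensity model is correct and there is no unmeasured confounding.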
1. The main limitation lies in the interpretability of subgroups, especially regarding how these identified subgroups can be understood and applied by clinical practitioners. For example, in Appendix C, it would be helpful to know the characteristics of the two subgroups. If a patient presents with specific features, how would doctors classify them into a subgroup? Additionally, how statistically stable are these subgroups?
2. The presentation of the paper is somewhat tedious. The problem setup
- The work focuses on model development on observational data, which is vital in this field as clinical trial data are expensive and limited. The study has considered removing potential biases in the data.
- The study models survival data, which are prevalent in the medical field, and the proposed method also considers the censoring event.
- The study evaluates the proposed methods in terms of accuracy in subgroup identification and treatment effect estimation, which is solid.
- The st
- There is a well-known work called "DeepSurv" [1] in this field, published in 2018. Although it does not consider subgroup identification, I think you should compare your work with theirs in terms of IAE.
- For identifying the subgroups, the work states that it utilizes a vector $l_k$ (please consider changing the notation to $v$, as in your previous sections $l$ is used for the log-likelihood); however, there is no description of how you got the $l_k$. Is $l_k$ a trainable parameters
Taxonomy
Topics: Statistical Methods in Clinical Trials · Advanced Causal Inference Techniques · Statistical Methods and Inference
