Direct Doubly Robust Estimation of Conditional Quantile Contrasts
Josh Givens, Song Liu, Henry W J Reeve, Katarzyna Reluga

TL;DR
This paper introduces a direct estimation method for the conditional quantile comparator (CQC) in heterogeneous treatment effect analysis, improving interpretability and accuracy over existing indirect methods.
Contribution
The paper proposes the first direct estimator of the CQC, which allows the CQC to be modelled explicitly, improves interpretability, and yields improved estimation error bounds while retaining double robustness.
Findings
Outperforms existing methods in estimation accuracy across various data scenarios.
Retains the double robustness property with respect to nuisance parameter estimation.
Applied to real-world employment data, revealing age-related effects on earnings.
Abstract
Within heterogeneous treatment effect (HTE) analysis, various estimands have been proposed to capture the effect of a treatment conditional on covariates. Recently, the conditional quantile comparator (CQC) has emerged as a promising estimand, offering quantile-level summaries akin to the conditional quantile treatment effect (CQTE) while preserving some interpretability of the conditional average treatment effect (CATE). It achieves this by summarising the treated response conditional on both the covariates and the untreated response. Despite these desirable properties, the CQC's current estimation is limited by the need to first estimate the difference in conditional cumulative distribution functions and then invert it. This inversion obscures the CQC estimate, hampering our ability to both model and interpret it. To address this, we propose the first direct estimator of the CQC,…
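The indirect route the abstract describes (estimate conditional CDFs, then invert) can be illustrated with a toy sketch. It is not the paper's estimator: it ignores covariates, uses plain empirical CDFs, and assumes the CQC maps an untreated response $y_0$ to the treated response at the same quantile level, i.e. $F_1^{-1}(F_0(y_0))$. The function name `cqc_by_inversion` and the Gaussian data are hypothetical.

```python
import numpy as np

def cqc_by_inversion(y0, treated, untreated):
    """Toy indirect CQC: evaluate the untreated empirical CDF at y0,
    then invert the treated empirical CDF at that level (no covariates)."""
    untreated = np.sort(np.asarray(untreated, dtype=float))
    # Empirical CDF of the untreated responses evaluated at y0.
    level = np.searchsorted(untreated, y0, side="right") / untreated.size
    # Inversion step: empirical quantile of the treated responses.
    return np.quantile(np.asarray(treated, dtype=float), np.clip(level, 0.0, 1.0))

# Hypothetical data: Y0 ~ N(0, 1) and Y1 ~ N(1, 2^2), so the
# same-quantile map is y0 -> 1 + 2 * y0.
rng = np.random.default_rng(0)
untreated = rng.normal(0.0, 1.0, size=20_000)
treated = rng.normal(1.0, 2.0, size=20_000)

estimate = cqc_by_inversion(0.5, treated, untreated)
print(estimate)  # should be close to 1 + 2 * 0.5 = 2.0
```

In this sketch the estimate is only available through the two-step evaluate-then-invert computation, so there is no explicit model of the map from $y_0$ to the treated response that could be constrained or inspected directly.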
Peer Reviews
Decision: ICLR 2026 Poster
The paper presents the first direct doubly robust estimator for the Conditional Quantile Contrast (CQC), offering a method that effectively connects theoretical causal inference principles with practical implementation. It contributes new finite-sample and convergence bounds, extending robustness theory in heterogeneous treatment effect estimation. The proposed algorithm is clearly articulated through a gradient-based optimization procedure (Algorithm 1), making it both conceptually transparent
1. The real-world data analysis, while effectively illustrating the interpretability of the proposed estimator, lacks quantitative comparisons to other causal inference methods. The employment dataset experiment focuses on qualitative visualization of treatment heterogeneity but does not benchmark performance against either inversion-based CQC estimators or widely used CATE-based models such as TARNet, DragonNet, or BART. Including such comparisons would provide essential empirical context, clar
1. The proposed framework enables parametrization of the CQC function, providing a means to enforce structural assumptions on the model and to represent the estimation error in terms of the complexity of the CQC itself.
2. The idea of transforming the optimization scheme into a convex optimization problem by introducing a loss function whose derivative with respect to $y_1$ is the contrasting function was fascinating.
3. Empirical results demonstrate improved performance.
1. The term "direct CQC estimator" seems to be somewhat misleading, as the proposed estimator in Section 3.1 is defined with respect to the gradient rather than the estimand of interest, $g^*$.
2. Also related to point 1, and as the authors already acknowledged in the limitations, the double robustness established in Theorem 3 holds with respect to the loss function rather than the CQC estimate $g_{\hat{\theta}}$. As the estimand of interest is $g^*$, it seems imperative to demonstrate t
1. This work provides a practical, direct, and more efficient alternative, making the CQC a much more usable tool.
2. The core method is novel. The idea of framing the CQC estimation problem as an M-estimation task by defining a loss $l$ such that $\partial_{y_1} l = h$ (with $h = F_1 - F_0$) is very interesting. Deriving the doubly-robust gradient $\zeta_{dr}$ (Proposition 2) provides a new class of estimators.
3. The method is theoretically solid. The paper provides finite-sample bounds that fo
1. Practicality of model selection. The paper notes as a limitation that there is "no natural definition of test loss". The method optimizes based on an estimated gradient of the population loss, not a sample-based loss (like MSE). This makes standard validation and hyperparameter tuning very difficult. The paper suggests an approximation via quadrature (Appendix B.2), but this is complex and a significant practical barrier.
2. The algorithm (Algorithm 1) requires sampling test points $Y_0$ to
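The reviews above describe a loss $l$ whose derivative in $y_1$ is the contrast $h = F_1 - F_0$. A minimal sketch of why descending such a loss recovers the CQC is given below. It is not the paper's Algorithm 1: there are no covariates, no nuisance estimation, and no doubly robust correction. It assumes $h(y_1, y_0) = F_1(y_1) - F_0(y_0)$ with known, hypothetical Gaussian CDFs. Because $F_1$ is non-decreasing, the loss is convex in $y_1$, and gradient descent settles where $h = 0$.

```python
import math

def normal_cdf(y, mean, std):
    """CDF of a N(mean, std^2) distribution."""
    return 0.5 * (1.0 + math.erf((y - mean) / (std * math.sqrt(2.0))))

def contrast(y1, y0):
    # Assumed form h(y1, y0) = F_1(y1) - F_0(y0), with hypothetical
    # F_0 = N(0, 1) and F_1 = N(1, 2^2).
    return normal_cdf(y1, 1.0, 2.0) - normal_cdf(y0, 0.0, 1.0)

def descend(y0, y1_init=0.0, step=2.0, iters=500):
    """Gradient descent in y1 on a loss whose y1-derivative is `contrast`."""
    y1 = y1_init
    for _ in range(iters):
        # The gradient is the contrast itself; the loss is never evaluated.
        y1 -= step * contrast(y1, y0)
    return y1

solution = descend(0.5)
print(solution)  # the root of h(., 0.5); for these CDFs it is 1 + 2 * 0.5 = 2.0
```

The loop only ever touches the gradient, never a loss value, which matches the reviewers' remark that the method optimizes through an estimated gradient rather than a sample-based loss.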
Taxonomy
Topics: Advanced Causal Inference Techniques · Statistical Methods and Inference · Statistical Methods and Bayesian Inference
