Outcome-Aware Spectral Feature Learning for Instrumental Variable Regression
Dimitri Meunier, Jakub Wornbard, Vladimir R. Kostic, Antoine Moulin, Alek Fröhlich, Karim Lounici, Massimiliano Pontil, Arthur Gretton

TL;DR
This paper introduces an outcome-aware spectral feature learning framework for nonparametric instrumental variable regression, improving causal effect estimation by incorporating outcome information into feature learning.
Contribution
It proposes a novel augmented spectral feature learning method that makes feature extraction outcome-aware, addressing limitations of traditional spectral methods.
Findings
Method improves causal effect estimation accuracy.
Framework remains effective under spectral misalignment.
Validated on challenging benchmark datasets.
Abstract
We address the problem of causal effect estimation in the presence of hidden confounders using nonparametric instrumental variable (IV) regression. An established approach is to use estimators based on learned spectral features, that is, features spanning the top singular subspaces of the operator linking treatments to instruments. While powerful, such features are agnostic to the outcome variable. Consequently, the method can fail when the true causal function is poorly represented by these dominant singular functions. To mitigate this, we introduce Augmented Spectral Feature Learning, a framework that makes the feature learning process outcome-aware. Our method learns features by minimizing a novel contrastive loss derived from an augmented operator that incorporates information from the outcome. By learning these task-specific features, our approach remains effective even under spectral…
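As a rough illustration of what "features spanning the top singular subspaces of the operator linking treatments to instruments" can mean, the sketch below works in a finite, discrete setting, which is an assumption made purely for illustration (the abstract describes learned features, not this closed form). The function name `top_singular_functions` and the random joint table are hypothetical. For discrete treatment X and instrument Z, the conditional expectation operator can be represented by the joint probability table rescaled by the marginals, and its singular vectors give outcome-agnostic feature functions.

```python
import numpy as np

def top_singular_functions(joint, k):
    """Top-k singular functions of the conditional expectation operator
    for discrete X (rows) and Z (columns). Illustrative sketch only."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()              # normalize to a probability table
    p_x = joint.sum(axis=1)                  # marginal of X
    p_z = joint.sum(axis=0)                  # marginal of Z
    # Rescaled table P(x, z) / sqrt(p(x) p(z)); its SVD matches the SVD of
    # the conditional expectation operator between L2(P_X) and L2(P_Z).
    normalized = joint / np.sqrt(np.outer(p_x, p_z))
    u, s, vt = np.linalg.svd(normalized, full_matrices=False)
    # Undo the square-root weighting to obtain functions of x and of z.
    feats_x = u[:, :k] / np.sqrt(p_x)[:, None]
    feats_z = vt[:k].T / np.sqrt(p_z)[:, None]
    return s[:k], feats_x, feats_z

# Hypothetical strictly positive joint table over 4 X-values and 5 Z-values.
rng = np.random.default_rng(0)
joint = rng.uniform(0.1, 1.0, size=(4, 5))
sing_vals, feats_x, feats_z = top_singular_functions(joint, k=2)
print(sing_vals)
```

In this construction the leading singular value equals 1 and its singular functions are constant, so the informative directions start at the second pair. Nothing in the computation looks at the outcome, which is the limitation the abstract points to: a causal function lying mostly outside the retained directions would be poorly captured.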
Peer Reviews
Decision·Submitted to ICLR 2026
- The paper introduces a novel “outcome-aware” perspective in spectral feature alignment, an area typically dominated by purely structure-based graph methods. It bridges the gap between spectral methods and outcome-driven modeling, creating a new formulation that incorporates outcome distributions directly into spectral regularization.
- The paper presents a solid theoretical foundation, with clear derivations connecting the proposed objective to classical spectral theory and graph Laplacian pro
- The paper offers a rigorous spectral derivation, but lacks a clear theoretical link between outcome-aware alignment and generalization performance. The intuition that aligning features with outcomes leads to better predictive embeddings is compelling but not mathematically formalized.
- The experimental evaluation mainly benchmarks against classical spectral and GNN methods (e.g., GCN, LapRLS), but omits comparison with more recent outcome- or label-sensitive graph models.
- Although the resul
The paper demonstrates high originality by identifying and tackling a fundamental, previously overlooked flaw in a state-of-the-art method. It formally characterizes the problem of "spectral misalignment," where outcome-agnostic feature learning fails, and introduces a principled solution through the novel concept of an augmented operator and a corresponding contrastive loss. The work is supported by exceptional methodological rigor, combining a compelling theoretical analysis with comprehensiv
While the empirical results are compelling within the constructed synthetic and semi-synthetic frameworks, the practical significance of these benchmarks for real-world causal inference problems remains less clear. The experiments, including the new dSprites benchmark, operate in controlled environments where the core assumptions of the model, such as the validity of the instrumental variable and the specific form of the structural causal model, are guaranteed by design. In practice, these ass
A generalization bound for the two-stage least squares (2SLS) estimator using learned features. An analysis showing robustness of ASFL under spectral misalignment. Empirical validation on synthetic and dSprites-based IV benchmarks demonstrates the performance of the proposed method. The theoretical novelty and conceptual framing are impactful, even if the experiments are limited.
The tuning parameter δ remains heuristic, with no principled selection strategy, which limits the method’s reproducibility and practical applicability. While theoretically elegant, real-world problems often require capturing multiple outcome-relevant directions. Extending the framework to multi-dimensional or adaptive augmentations would enhance its generality. The experimental evaluation is also limited, with comparisons restricted to DFIV and KIV and no validation on real-world datasets.
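For readers unfamiliar with the two-stage least squares (2SLS) estimator on features that the reviews refer to, here is a minimal sketch. It assumes fixed, hand-chosen feature maps rather than learned ones, a simple linear simulated data set, and a hypothetical ridge parameter `reg`; none of these choices come from the paper.

```python
import numpy as np

def two_stage_least_squares(phi_x, psi_z, y, reg=1e-6):
    """2SLS on given features: phi_x (n, d) for treatments,
    psi_z (n, m) for instruments, y (n,) outcomes."""
    n, m = psi_z.shape
    # Stage 1: ridge-regress treatment features on instrument features.
    gram_z = psi_z.T @ psi_z + n * reg * np.eye(m)
    stage1 = np.linalg.solve(gram_z, psi_z.T @ phi_x)
    pred = psi_z @ stage1                    # estimate of E[phi(X) | Z]
    # Stage 2: ridge-regress the outcome on the predicted features.
    d = pred.shape[1]
    gram_p = pred.T @ pred + n * reg * np.eye(d)
    return np.linalg.solve(gram_p, pred.T @ y)

# Simulated data with a hidden confounder u (illustrative assumption).
rng = np.random.default_rng(0)
n = 20000
z = rng.normal(size=n)                       # instrument
u = rng.normal(size=n)                       # hidden confounder
x = z + u                                    # treatment
y = 2.0 * x + 3.0 * u + 0.1 * rng.normal(size=n)  # true causal slope is 2

phi_x = x[:, None]                           # treatment feature: x itself
psi_z = np.column_stack([np.ones(n), z])     # instrument features: [1, z]
beta_2sls = float(two_stage_least_squares(phi_x, psi_z, y)[0])
beta_ols = float(x @ y / (x @ x))            # naive regression, confounded
print(beta_2sls, beta_ols)
```

In this simulation the naive regression slope is pulled away from the true value of 2 by the confounder, while the 2SLS estimate stays close to it. The quality of the second stage depends on how well the chosen features represent the causal function, which is where the choice of features discussed in the abstract matters.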
Taxonomy
Topics: Advanced Causal Inference Techniques · Statistical Methods and Inference · Bayesian Modeling and Causal Inference
