Learning Treatment Representations for Downstream Instrumental Variable Regression

Shiangyi Lin; Hui Lan; Vasilis Syrgkanis

arXiv:2506.02200·cs.LG·June 25, 2025


Open Access · 3 Reviews

TL;DR

This paper introduces a novel method for learning treatment representations in instrumental variable regression that explicitly incorporates instruments during representation learning, improving identification and reducing bias in high-dimensional settings.

Contribution

The paper proposes a new approach to construct treatment representations by integrating instrumental variables into the learning process, addressing limitations of traditional dimension reduction methods.

Findings

01. Instrument-informed representations improve outcome prediction.

02. The method reduces omitted variable bias in high-dimensional IV regression.

03. Empirical results outperform conventional two-stage approaches.
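
As a hedged illustration of where omitted variable bias can come from in IV regression on a learned representation, consider the simplest linear case. The symbols are assumptions made for this sketch, not the paper's notation: $Z$ is a scalar instrument, $R$ the retained scalar representation, $R_\perp$ the treatment variation the representation discards, and $\theta$, $\theta_\perp$ their effects on the outcome $Y$.

```latex
\begin{align*}
Y &= \theta R + \theta_\perp R_\perp + \varepsilon,
  \qquad \operatorname{Cov}(Z,\varepsilon) = 0, \\
\hat\theta_{\mathrm{IV}} \;\xrightarrow{p}\;
  \frac{\operatorname{Cov}(Z,Y)}{\operatorname{Cov}(Z,R)}
  &= \theta + \theta_\perp\,
     \frac{\operatorname{Cov}(Z,R_\perp)}{\operatorname{Cov}(Z,R)} .
\end{align*}
```

In this sketch the second term vanishes only when the discarded component is uncorrelated with the instrument, which is the property an instrument-informed representation would aim to secure.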

Abstract

Traditional instrumental variable (IV) estimators face a fundamental constraint: they can only accommodate as many endogenous treatment variables as available instruments. This limitation becomes particularly challenging in settings where the treatment is presented in a high-dimensional and unstructured manner (e.g. descriptions of patient treatment pathways in a hospital). In such settings, researchers typically resort to applying unsupervised dimension reduction techniques to learn a low-dimensional treatment representation prior to implementing IV regression analysis. We show that such methods can suffer from substantial omitted variable bias due to implicit regularization in the representation learning step. We propose a novel approach to construct treatment representations by explicitly incorporating instrumental variables during the representation learning process. Our approach…
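
A minimal simulation sketch of the bias the abstract describes, under assumptions that are not taken from the paper: a linear model, two treatment coordinates, a single instrument, and hypothetical names (`simulate`, `iv_slope`, `compare_representations`). It compares a PCA direction with the direction of the sample covariance between treatment and instrument. The latter is a simple stand-in chosen for illustration, not the authors' method.

```python
import numpy as np


def iv_slope(z, r, y):
    """Just-identified IV (Wald) estimate: Cov(z, y) / Cov(z, r)."""
    zc = z - z.mean()
    return float(zc @ (y - y.mean()) / (zc @ (r - r.mean())))


def simulate(n=200_000, seed=0):
    """Toy linear data: one instrument z, two treatment coordinates t, outcome y."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                      # instrument
    u = rng.normal(size=n)                      # unobserved confounder
    gamma = np.array([0.5, 1.0])                # assumed first-stage loadings
    beta = np.array([1.0, 2.0])                 # assumed structural effects
    noise = rng.normal(size=(n, 2)) * np.array([3.0, 0.3])
    # Coordinate 0 is noisy and confounded; coordinate 1 is mostly instrument-driven.
    t = np.outer(z, gamma) + np.outer(u, [1.0, 0.0]) + noise
    y = t @ beta + 2.0 * u + rng.normal(size=n)
    return z, t, y, beta


def unit(v):
    """Normalize v and fix its sign so the largest-magnitude entry is positive."""
    v = v / np.linalg.norm(v)
    return v if v[np.argmax(np.abs(v))] > 0 else -v


def compare_representations(n=200_000, seed=0):
    z, t, y, beta = simulate(n, seed)
    tc = t - t.mean(axis=0)
    # Unsupervised choice: leading principal direction of the treatment.
    _, eigvecs = np.linalg.eigh(np.cov(tc, rowvar=False))
    w_pca = unit(eigvecs[:, -1])
    # Instrument-aware stand-in: direction of the sample Cov(t, z).
    w_iv = unit(tc.T @ (z - z.mean()) / n)
    out = {}
    for name, w in (("pca", w_pca), ("instrument", w_iv)):
        r = t @ w                                # scalar representation
        target = float(beta @ w)                 # coefficient on r when y is split along w
        out[name] = {"iv": iv_slope(z, r, y), "target": target}
    return out


if __name__ == "__main__":
    for name, res in compare_representations().items():
        print(name, res, "bias:", res["iv"] - res["target"])
```

With these assumed parameters, the PCA direction follows the high-variance coordinate, so the IV slope absorbs the effect of the discarded instrument-driven variation. The instrument-aligned direction leaves a residual that is uncorrelated with the instrument in-sample, so its IV slope stays close to the coefficient on that representation.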

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 4 · Confidence 3

Strengths

- The IV setting with potentially high-dimensional confounded treatments is an important and underexplored direction in causal inference research.
- The motivation, showing that omitted variable bias can also come from dimensionality reduction or representation learning of treatments (and not only of confounders, as most previous work showed), is an interesting and important finding.
- Their method, ensuring that the IV information is maintained in the treatment representation, is a nice and…

Weaknesses

The clarity of the paper could be improved, and the implications of different parts of the method could be mentioned.
- Part of the main motivation is that IV estimators “can only accommodate as many endogenous treatment variables as available instruments”. While it generally makes sense that effect estimation with high-dimensional treatments is challenging, I think this statement should be explained and demonstrated in more detail.
- I think the related work is a bit limited to some specific works an…

Reviewer 02 · Rating 4 · Confidence 2

Strengths

The authors pinpoint how standard unsupervised dimensionality reduction of high-dimensional treatments can violate the exclusion restriction by discarding instrument-driven variation, a problem they term omitted treatment bias. The proposed solution of instrument-guided representation learning represents a creative fusion of causal inference and representation learning, directly addressing this limitation in prior two-stage approaches. The authors develop a complete framework with specialized m…

Weaknesses

The theoretical foundation of this paper relies on a set of strong structural assumptions that may be difficult to satisfy in real-world applications. A key example is the core assumption of joint independence, which requires the instrument, the confounder representation, and the orthogonal components to be fully independent. This is particularly challenging to guarantee with high-dimensional and complex data. Furthermore, prerequisites such as the invertibility of the encoding and dec…

Reviewer 03 · Rating 2 · Confidence 4

Strengths

- The problem setting is interesting and, to the best of my knowledge, novel.

Weaknesses

- Line 67: The paper states that causal representation learning is uncommon in causal inference, but common in causal discovery. This is not true. The paper thus lacks a sufficient discussion of related work on representation learning in causal inference tasks.
- The method requires very strong untestable assumptions, limiting its applicability in practice.
- The paper neither contains a discussion on the method, its limitations, or the results, nor a conclusion, leaving it an unfinished work.


Taxonomy

Topics: Machine Learning in Healthcare · Machine Learning and Algorithms