Learning plug-in surrogate endpoints for randomized experiments

Alessandro-Umberto Margueritte; Ahmet Zahid Balcıoğlu; Jesse Krijthe; Dave Zachariah; Fredrik D. Johansson

arXiv:2605.12051·cs.LG·May 13, 2026

TL;DR

This paper introduces methods for learning plug-in surrogate endpoints: functions of post-treatment variables whose result in a randomized experiment is predictive of the result that a study using the long-term outcome would give.

Contribution

The paper proposes two algorithms for learning plug-in surrogate endpoints that maximize effect predictiveness, and reports that they outperform existing methods.

Findings

01

The proposed methods outperform established approaches in synthetic and real-world experiments.

02

Plug-in surrogates learned by the proposed methods provide more accurate effect predictions.

03

The approach enables unbiased effect estimation under certain scenarios.

Abstract

Surrogate endpoints are used in place of long-term outcomes in randomized experiments when observing the real outcome for a large enough cohort is prohibitively expensive or impractical. A short-term surrogate is good if the result of an experiment using the surrogate is predictive of the result of a hypothetical study using the real outcome. Much attention has been paid to formalizing this property in causal terms, but most criteria are unidentifiable and cannot be turned into practical algorithms for learning surrogate endpoints from data. To address this, we study plug-in composite surrogates, functions of post-treatment variables that may be substituted directly for the primary outcome in a randomized experiment. We propose two methods for learning plug-in surrogates that maximize effect predictiveness, and characterize the possibility of finding endpoints that yield unbiased effect…
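
To make the plug-in idea concrete, the following is a minimal sketch on simulated data, not the paper's algorithms. It substitutes a composite surrogate (a function of two short-term, post-treatment variables) for the long-term outcome and compares the two difference-in-means effect estimates. The data-generating process, the linear form of the surrogate, the weights, and all function names are assumptions made for illustration; in particular, the weights are taken as known here, whereas the paper is concerned with learning the surrogate from data.

```python
import random
from statistics import mean


def diff_in_means(values, treated):
    """Difference-in-means effect estimate for a randomized experiment."""
    treated_vals = [v for v, a in zip(values, treated) if a == 1]
    control_vals = [v for v, a in zip(values, treated) if a == 0]
    return mean(treated_vals) - mean(control_vals)


def plug_in_surrogate(post_vars, weights):
    """Hypothetical composite surrogate: a linear function of post-treatment variables."""
    return sum(w * x for w, x in zip(weights, post_vars))


def simulate_experiment(n, shift, rng):
    """Toy data (assumed): treatment shifts two short-term variables,
    and the long-term outcome depends on treatment only through them."""
    treated, post, outcome = [], [], []
    for _ in range(n):
        a = rng.randint(0, 1)                      # randomized assignment
        s1 = shift[0] * a + rng.gauss(0, 1)        # short-term variable 1
        s2 = shift[1] * a + rng.gauss(0, 1)        # short-term variable 2
        y = 2.0 * s1 - 1.0 * s2 + rng.gauss(0, 1)  # long-term outcome
        treated.append(a)
        post.append((s1, s2))
        outcome.append(y)
    return treated, post, outcome


rng = random.Random(0)
treated, post, outcome = simulate_experiment(4000, (1.0, 0.5), rng)

# Assumed weights; they match the toy outcome model by construction.
weights = (2.0, -1.0)
surrogate = [plug_in_surrogate(s, weights) for s in post]

effect_true = diff_in_means(outcome, treated)    # needs the long-term outcome
effect_surr = diff_in_means(surrogate, treated)  # needs only short-term variables

print("effect on long-term outcome:", round(effect_true, 3))
print("effect on plug-in surrogate:", round(effect_surr, 3))
```

In this toy setup the true average effect is 2.0 * 1.0 - 1.0 * 0.5 = 1.5, and the surrogate-based estimate tracks the outcome-based one because the outcome depends on treatment only through the two short-term variables. When that assumption fails, or when the weights are poorly chosen, the two estimates can diverge, which is the situation that motivates learning surrogates for effect predictiveness.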
