Unifying On- and Off-Policy Variance Reduction Methods

Olivier Jeunen

arXiv:2603.08370·stat.ML·March 10, 2026

Open Access

TL;DR

This paper establishes a formal equivalence between the canonical variance reduction methods of online A/B testing and off-policy evaluation, giving practitioners one statistical framework for both.

Contribution

The paper proves that key variance reduction techniques from online experimentation and off-policy evaluation are mathematically equivalent, bridging two domains that often operate in isolation with distinct terminologies and statistical toolkits.

Findings

01. The online Difference-in-Means estimator equals an off-policy Inverse Propensity Scoring estimator with an optimal (variance-minimising) additive control variate.

02. Regression adjustment methods (such as CUPED, CUPAC, and ML-RATE) are structurally equivalent to Doubly Robust estimation.

03. The unified framework guides better application of variance reduction techniques.
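
The first finding can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's own derivation. It assumes a two-arm test with a known treatment probability `p`. It takes the signed importance weight `w` (which has mean zero) as the additive control variate. It estimates the coefficient `beta` with the usual sample covariance over sample variance. The simulated data and all function names are hypothetical.

```python
import numpy as np


def difference_in_means(y, a):
    """Mean outcome under treatment minus mean outcome under control."""
    return y[a == 1].mean() - y[a == 0].mean()


def ips_with_control_variate(y, a, p):
    """IPS estimate of the treatment effect with an additive control variate.

    w is the signed importance weight: +1/p for treated units and
    -1/(1-p) for control units, so E[w] = 0 under the logging design.
    beta is the sample estimate of Cov(y*w, w) / Var(w), the
    variance-minimising coefficient for the control variate w.
    """
    w = np.where(a == 1, 1.0 / p, -1.0 / (1.0 - p))
    z = y * w
    beta = np.cov(z, w, ddof=1)[0, 1] / np.var(w, ddof=1)
    estimate = np.mean((y - beta) * w)
    return estimate, beta


# Hypothetical simulated A/B test (assumed values, for illustration only).
rng = np.random.default_rng(0)
n, p = 10_000, 0.3
a = rng.binomial(1, p, size=n)           # treatment assignment
y = 1.0 + 0.5 * a + rng.normal(size=n)   # outcome with a true effect of 0.5

dim = difference_in_means(y, a)
ips_cv, beta = ips_with_control_variate(y, a, p)
print(dim, ips_cv, beta)
```

Because `w` takes only two values, the regression of `y * w` on `w` fits both arm means exactly, so the two estimates agree up to floating-point error in this setup. With a fixed, hand-picked `beta` they would generally differ.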

Abstract

Continuous and efficient experimentation is key to the practical success of user-facing applications on the web, both through online A/B-tests and off-policy evaluation. Despite their shared objective -- estimating the incremental value of a treatment -- these domains often operate in isolation, utilising distinct terminologies and statistical toolkits. This paper bridges that divide by establishing a formal equivalence between their canonical variance reduction methods. We prove that the standard online Difference-in-Means estimator is mathematically identical to an off-policy Inverse Propensity Scoring estimator equipped with an optimal (variance-minimising) additive control variate. Extending this unification, we demonstrate that widespread regression adjustment methods (such as CUPED, CUPAC, and ML-RATE) are structurally equivalent to Doubly Robust estimation. This unified view…
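
The regression-adjustment claim can be illustrated in the same spirit. The sketch below is one possible reading, not the paper's construction. It assumes a single pre-experiment covariate `x` and a CUPED-style linear adjustment with a pooled coefficient `theta`. Its Doubly Robust estimator uses that linear fit as the outcome model for both arms and uses the empirical treatment share as the propensity. All names and simulated data are hypothetical.

```python
import numpy as np


def cuped_estimate(y, x, a):
    """Difference-in-means on CUPED-adjusted outcomes."""
    theta = np.cov(y, x, ddof=1)[0, 1] / np.var(x, ddof=1)
    y_adj = y - theta * (x - x.mean())
    return y_adj[a == 1].mean() - y_adj[a == 0].mean(), theta


def doubly_robust_estimate(y, x, a, theta):
    """Doubly Robust treatment-effect estimate with a shared linear model.

    Assumption: the outcome model m(x) = theta * (x - mean(x)) is the same
    for both arms, so the direct-method term m1(x) - m0(x) is zero and only
    the importance-weighted residual correction remains.
    Assumption: the propensity is the empirical treatment share.
    """
    p_hat = a.mean()
    w = np.where(a == 1, 1.0 / p_hat, -1.0 / (1.0 - p_hat))
    m = theta * (x - x.mean())
    direct_term = 0.0  # m1(x) - m0(x) for a model without treatment terms
    return direct_term + np.mean(w * (y - m))


# Hypothetical simulated experiment (assumed values, for illustration only).
rng = np.random.default_rng(1)
n = 10_000
x = rng.normal(size=n)                             # pre-experiment covariate
a = rng.binomial(1, 0.5, size=n)                   # treatment assignment
y = 2.0 * x + 0.3 * a + rng.normal(size=n)         # true effect of 0.3

cuped, theta = cuped_estimate(y, x, a)
dr = doubly_robust_estimate(y, x, a, theta)
print(cuped, dr)
```

Under these assumptions the two numbers coincide up to floating-point error. If the known design propensity is used instead of the empirical share, the weights no longer sum to exactly zero and the estimates generally differ slightly in finite samples.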

Taxonomy

Topics: Advanced Causal Inference Techniques · Advanced Bandit Algorithms Research · Statistical Methods and Inference