Practical Improvements of A/B Testing with Off-Policy Estimation

Otmane Sakhi; Alexandre Gilotte; David Rohde

arXiv:2506.10677 · stat.ML · June 16, 2025

Open Access

TL;DR

This paper proposes a family of unbiased off-policy estimators for A/B testing that achieve lower variance than the standard difference-in-means estimator, especially when the two tested systems are similar. The claim is supported by theoretical analysis and experiments.

Contribution

Introduces a new family of unbiased off-policy estimators for A/B testing that achieve lower variance than standard difference-in-means estimators.

Findings

01. The proposed estimator substantially reduces variance when the two tested systems are similar.

02. The estimator is simple and practical to implement.

03. Theoretical analysis and experiments support the estimator's effectiveness.

Abstract

We address the problem of A/B testing, a widely used protocol for evaluating the potential improvement achieved by a new decision system compared to a baseline. This protocol segments the population into two subgroups, each exposed to a version of the system, and estimates the improvement as the difference between the measured effects. In this work, we demonstrate that the commonly used difference-in-means estimator, while unbiased, can be improved. We introduce a family of unbiased off-policy estimators that achieves lower variance than the standard approach. Among this family, we identify the estimator with the lowest variance. The resulting estimator is simple and offers substantial variance reduction when the two tested systems exhibit similarities. Our theoretical analysis and experimental results validate the effectiveness and practicality of the proposed method.
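The abstract does not spell out the estimators, so the following is only a minimal sketch of the idea under stated assumptions: a context-free setting with a small discrete action set, and action probabilities `pi_a` and `pi_b` that are known for both systems. It compares the difference-in-means estimator with a generic importance-weighted estimator that pools both groups and reweights each sample by the mixture of the two policies. That second estimator is a standard off-policy construction chosen for illustration; it is not necessarily the lowest-variance member of the family the paper identifies. All names (`difference_in_means`, `mixture_weighted_difference`, `simulate`, `pi_a`, `pi_b`, `mean_reward`) and all numeric values are hypothetical.

```python
import numpy as np


def difference_in_means(rewards_a, rewards_b):
    """Standard A/B estimate: mean reward of group B minus mean reward of group A."""
    return float(np.mean(rewards_b) - np.mean(rewards_a))


def mixture_weighted_difference(actions_a, rewards_a, actions_b, rewards_b, pi_a, pi_b):
    """Illustrative off-policy estimate of V(pi_b) - V(pi_a) (an assumption,
    not confirmed to be the paper's estimator).

    Samples from both groups are pooled. Each sample is weighted by
    (pi_b(a) - pi_a(a)) / mix(a), where mix is the group-size-weighted
    mixture of the two policies. With fixed group sizes this is unbiased,
    and the weights shrink toward zero as the two policies get closer.
    """
    n_a, n_b = len(actions_a), len(actions_b)
    mix = (n_a * pi_a + n_b * pi_b) / (n_a + n_b)
    actions = np.concatenate([actions_a, actions_b])
    rewards = np.concatenate([rewards_a, rewards_b])
    weights = (pi_b[actions] - pi_a[actions]) / mix[actions]
    return float(np.mean(rewards * weights))


def simulate(n_reps=500, n_per_group=500, seed=0):
    """Repeat a synthetic A/B test and collect both estimates per repetition."""
    rng = np.random.default_rng(seed)
    # Hypothetical problem: 3 actions with Bernoulli rewards.
    mean_reward = np.array([0.2, 0.5, 0.8])
    pi_a = np.array([0.50, 0.30, 0.20])  # baseline system
    pi_b = np.array([0.45, 0.30, 0.25])  # new system, close to the baseline
    true_delta = float(np.sum(mean_reward * (pi_b - pi_a)))

    dim, off_policy = [], []
    for _ in range(n_reps):
        # Each group only sees actions drawn from its own system.
        actions_a = rng.choice(3, size=n_per_group, p=pi_a)
        actions_b = rng.choice(3, size=n_per_group, p=pi_b)
        rewards_a = rng.binomial(1, mean_reward[actions_a])
        rewards_b = rng.binomial(1, mean_reward[actions_b])
        dim.append(difference_in_means(rewards_a, rewards_b))
        off_policy.append(
            mixture_weighted_difference(
                actions_a, rewards_a, actions_b, rewards_b, pi_a, pi_b
            )
        )
    return true_delta, np.array(dim), np.array(off_policy)


true_delta, dim, off_policy = simulate()
print("true improvement:", true_delta)
print("difference-in-means: mean", dim.mean(), "variance", dim.var())
print("mixture-weighted:    mean", off_policy.mean(), "variance", off_policy.var())
```

In this synthetic setup both estimators center on the same true improvement, while the pooled, reweighted estimate varies much less across repetitions because the two policies differ only slightly. The sketch relies on knowing the action probabilities of both systems, which the difference-in-means estimator does not need.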

Taxonomy

Topics: Advanced Causal Inference Techniques · Statistical Methods in Clinical Trials · SARS-CoV-2 detection and testing