A Two-armed Bandit Framework for A/B Testing

Jinjuan Wang; Qianglin Wen; Yu Zhang; Xiaodong Yan; Chengchun Shi

arXiv:2507.18118 · stat.ML · July 25, 2025


Open Access

TL;DR

This paper introduces a novel two-armed bandit framework for A/B testing that enhances statistical power by combining doubly robust estimation, bandit-based test statistics, and permutation methods, validated through theory, simulations, and real data.

Contribution

It presents a new testing procedure integrating bandit algorithms with causal inference techniques to improve A/B test power over existing methods.

Findings

01 Demonstrates superior performance in simulations

02 Shows effectiveness on real ridesharing data

03 Provides asymptotic theoretical guarantees

Abstract

A/B testing is widely used in modern technology companies for policy evaluation and product deployment, with the goal of comparing the outcomes under a newly-developed policy against a standard control. Various causal inference and reinforcement learning methods developed in the literature are applicable to A/B testing. This paper introduces a two-armed bandit framework designed to improve the power of existing approaches. The proposed procedure consists of three main steps: (i) employing doubly robust estimation to generate pseudo-outcomes, (ii) utilizing a two-armed bandit framework to construct the test statistic, and (iii) applying a permutation-based method to compute the $p$-value. We demonstrate the efficacy of the proposed method through asymptotic theories, numerical experiments and real-world data from a ridesharing company, showing its superior performance in comparison to…
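
As a rough illustration of steps (i) and (iii) only, the sketch below builds doubly robust (AIPW) pseudo-outcomes and computes a permutation $p$-value from them. It is not the authors' procedure: the bandit-based test statistic of step (ii) is not described in the abstract, so the plain sample mean of the pseudo-outcomes is used as a placeholder. The function names, the linear outcome models, the known assignment probability of 0.5, and the simulated data are all assumptions made for this example.

```python
import numpy as np


def aipw_pseudo_outcomes(y, a, x, propensity=0.5):
    """Doubly robust (AIPW) pseudo-outcomes for the treatment effect.

    Assumptions for this sketch: a known, constant assignment
    probability and one least-squares outcome model per arm.
    """
    design = np.column_stack([np.ones(len(y)), x])
    # Fit a separate linear outcome model on each arm.
    beta1, *_ = np.linalg.lstsq(design[a == 1], y[a == 1], rcond=None)
    beta0, *_ = np.linalg.lstsq(design[a == 0], y[a == 0], rcond=None)
    mu1, mu0 = design @ beta1, design @ beta0
    # Model-based contrast plus inverse-probability-weighted residuals.
    return (mu1 - mu0
            + a * (y - mu1) / propensity
            - (1 - a) * (y - mu0) / (1 - propensity))


def permutation_p_value(y, a, x, n_perm=500, seed=0):
    """One-sided permutation p-value for 'treatment improves the outcome'.

    Placeholder statistic: the mean pseudo-outcome. The paper's
    bandit-based statistic would replace this; it is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    observed = aipw_pseudo_outcomes(y, a, x).mean()
    perm_stats = np.empty(n_perm)
    for b in range(n_perm):
        # Shuffle the treatment labels and recompute the statistic.
        a_perm = rng.permutation(a)
        perm_stats[b] = aipw_pseudo_outcomes(y, a_perm, x).mean()
    # The +1 terms keep the p-value strictly positive.
    return (1 + np.sum(perm_stats >= observed)) / (n_perm + 1)


if __name__ == "__main__":
    # Simulated randomized experiment (hypothetical data).
    rng = np.random.default_rng(1)
    n = 400
    x = rng.normal(size=(n, 2))
    a = rng.integers(0, 2, size=n)
    y = x @ np.array([1.0, -0.5]) + 1.0 * a + rng.normal(size=n)
    print("p-value:", permutation_p_value(y, a, x))
```

Shuffling the treatment labels and recomputing the pseudo-outcomes is one common way to build a reference distribution under the null of no effect; the permutation scheme used in the paper may differ.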

Taxonomy

Topics: Advanced Bandit Algorithms Research · Adversarial Robustness in Machine Learning