Variance Reduction via Resampling and Experience Replay

Jiale Han; Xiaowu Dai; Yuhua Zhu

arXiv:2502.00520 · stat.ML · November 14, 2025

TL;DR

This paper develops a theoretical framework for experience replay in reinforcement learning, modeling it with resampled U- and V-statistics to guarantee variance reduction and improve stability across various algorithms.

Contribution

It introduces a rigorous variance reduction framework for experience replay, extending its application to policy evaluation and kernel ridge regression, with theoretical guarantees and practical efficiency gains.

Findings

1. Significant variance reduction in policy evaluation tasks.
2. Reduced computational complexity from O(n^3) to O(n^2) in kernel methods.
3. Empirical validation shows improved stability and efficiency.
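The complexity finding can be made concrete with a rough sketch of replay-style kernel ridge regression: fit small models on random batches drawn from the stored data, then average their predictions. This is an illustration under assumptions, not the paper's algorithm. The function names, the RBF kernel, the regularization scaling, and the batch settings are all hypothetical, and this summary does not state which batch size or number of batches yields the O(n^2) cost. The only point shown is that each solve costs on the order of batch_size^3 rather than n^3.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel matrix between rows of a and rows of b (assumed kernel)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)

def krr_fit_predict(x_train, y_train, x_test, lam=1e-3):
    """Plain kernel ridge regression; the linear solve is cubic in len(x_train)."""
    m = len(x_train)
    k = rbf_kernel(x_train, x_train)
    alpha = np.linalg.solve(k + lam * m * np.eye(m), y_train)
    return rbf_kernel(x_test, x_train) @ alpha

def replay_krr_predict(x, y, x_test, batch_size, num_batches, rng, lam=1e-3):
    """Average KRR predictions over random batches drawn from the full data set.

    Hypothetical replay-style estimator: total solve cost is roughly
    num_batches * batch_size**3 instead of len(x)**3.
    """
    preds = np.zeros(len(x_test))
    for _ in range(num_batches):
        idx = rng.choice(len(x), size=batch_size, replace=False)
        preds += krr_fit_predict(x[idx], y[idx], x_test, lam)
    return preds / num_batches
```

A call such as `replay_krr_predict(x, y, x_test, batch_size=50, num_batches=20, rng=np.random.default_rng(0))` expects `x` and `x_test` as 2-D arrays and `y` as a 1-D array.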

Abstract

Experience replay is a foundational technique in reinforcement learning that enhances learning stability by storing past experiences in a replay buffer and reusing them during training. Despite its practical success, its theoretical properties remain underexplored. In this paper, we present a theoretical framework that models experience replay using resampled $U$- and $V$-statistics, providing rigorous variance reduction guarantees. We apply this framework to policy evaluation tasks using the Least-Squares Temporal Difference (LSTD) algorithm and a Partial Differential Equation (PDE)-based model-free algorithm, demonstrating significant improvements in stability and efficiency, particularly in data-scarce scenarios. Beyond policy evaluation, we extend the framework to kernel ridge regression, showing that the experience replay-based method reduces the computational cost from the…
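As a minimal sketch of what resampled $U$- and $V$-statistics look like, the code below treats a data array as the replay buffer and averages an order-2 kernel over randomly drawn pairs. The kernel `h`, the function names, and the pair-sampling scheme are assumptions chosen for illustration; the paper's exact construction may differ. With this particular kernel, the full $U$-statistic equals the unbiased sample variance and the full $V$-statistic equals the biased (divide-by-n) sample variance, which gives an easy sanity check.

```python
import numpy as np

def h(x, y):
    """Symmetric order-2 kernel; its U-statistic is the unbiased sample variance."""
    return 0.5 * (x - y) ** 2

def resampled_u_statistic(data, num_draws, rng):
    """Average h over random pairs of distinct buffer entries (U-type resampling)."""
    n = len(data)
    i = rng.integers(0, n, size=num_draws)
    # Adding an offset in 1..n-1 modulo n guarantees j != i.
    j = (i + rng.integers(1, n, size=num_draws)) % n
    return h(data[i], data[j]).mean()

def resampled_v_statistic(data, num_draws, rng):
    """Average h over random pairs drawn with replacement (V-type resampling)."""
    n = len(data)
    i = rng.integers(0, n, size=num_draws)
    j = rng.integers(0, n, size=num_draws)  # i == j is allowed here
    return h(data[i], data[j]).mean()
```

With enough draws, `resampled_u_statistic` should approach `np.var(data, ddof=1)` and `resampled_v_statistic` should approach `np.var(data, ddof=0)`.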

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Multimedia Communication and Technology