The surprising efficiency of temporal difference learning for rare event prediction

Xiaoou Cheng; Jonathan Weare

arXiv:2405.17638·cs.LG·January 17, 2025

TL;DR

This paper shows that least-squares TD (LSTD) learning estimates rare event probabilities in finite-state Markov chains far more efficiently than the direct Monte Carlo (MC) estimator, especially when high relative accuracy is required.

Contribution

The paper provides a theoretical analysis showing that, for policy evaluation involving rare events, LSTD reaches a given relative accuracy with fewer samples than MC.

Findings

01 LSTD maintains fixed relative accuracy with polynomially many samples.

02 A central limit theorem for LSTD is established.

03 Upper bounds on relative asymptotic variance are derived.
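As a point of reference for why relative accuracy is costly for the direct estimator, the following is the textbook variance calculation for estimating a single event probability by independent runs. It is an illustration only and is not taken from the paper; the symbols $p$ (event probability), $N$ (number of independent runs), and $\varepsilon$ (target relative standard deviation) are assumptions introduced here.

```latex
\hat{p}_N = \frac{1}{N}\sum_{i=1}^{N} \mathbf{1}\{\text{event occurs in run } i\},
\qquad
\operatorname{Var}(\hat{p}_N) = \frac{p(1-p)}{N},
\qquad
\frac{\operatorname{Var}(\hat{p}_N)}{p^{2}} = \frac{1-p}{pN}.

% Requiring a relative standard deviation of at most \varepsilon gives
N \;\ge\; \frac{1-p}{p\,\varepsilon^{2}}.
```

For example, with $p = 10^{-6}$ and $\varepsilon = 0.1$ this requires roughly $10^{8}$ independent runs, so the cost of the direct estimator grows like $1/p$ as the event becomes rarer.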

Abstract

We quantify the efficiency of temporal difference (TD) learning over the direct, or Monte Carlo (MC), estimator for policy evaluation in reinforcement learning, with an emphasis on estimation of quantities related to rare events. Policy evaluation is complicated in the rare event setting by the long timescale of the event and by the need for relative accuracy in estimates of very small values. Specifically, we focus on least-squares TD (LSTD) prediction for finite-state Markov chains, and show that LSTD can achieve relative accuracy far more efficiently than MC. We prove a central limit theorem for the LSTD estimator and upper bound the relative asymptotic variance by simple quantities characterizing the connectivity of states relative to the transition probabilities between them. Using this bound, we show that, even when both the timescale of the rare event and the…
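The contrast between LSTD and the direct estimator described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption chosen for illustration and is not the paper's experimental setup: a birth-death chain on states 0..n with a constant up-probability `p_up`, the rare event "reach state n before state 0", one indicator (tabular) feature per interior state, and the hypothetical helper names `exact_hit_prob`, `lstd_hit_prob`, and `mc_hit_prob`. With tabular features, LSTD with no discounting reduces to solving the hitting-probability equations for the empirical transition matrix.

```python
import numpy as np

def exact_hit_prob(n, p_up):
    """Gambler's-ruin formula for P_1(hit n before 0); requires p_up != 0.5."""
    r = (1.0 - p_up) / p_up
    return (r - 1.0) / (r**n - 1.0)

def lstd_hit_prob(n, p_up, m, rng):
    """Tabular LSTD estimate of u(x) = P_x(hit n before 0) for x = 1..n-1.

    Data: m one-step transitions observed from each interior state
    (sampled here as a binomial count of upward moves).
    """
    d = n - 1                            # one indicator feature per interior state
    A = np.zeros((d, d))
    b = np.zeros(d)
    for x in range(1, n):
        ups = rng.binomial(m, p_up)      # number of observed x -> x+1 moves
        for y, count in ((x + 1, ups), (x - 1, m - ups)):
            A[x - 1, x - 1] += count     # phi(x) phi(x)^T term
            if y == n:
                b[x - 1] += count        # reward 1 on entering the target state
            elif y != 0:
                A[x - 1, y - 1] -= count # -phi(x) phi(y)^T for interior y
    return np.linalg.solve(A, b)

def mc_hit_prob(n, p_up, budget, rng):
    """Direct MC from state 1; new runs start until `budget` transitions are used."""
    used = hits = runs = 0
    while used < budget:
        x = 1
        while 0 < x < n:
            x += 1 if rng.random() < p_up else -1
            used += 1
        hits += (x == n)
        runs += 1
    return hits / runs

rng = np.random.default_rng(0)
n, p_up, m = 10, 0.3, 5000
u_true = exact_hit_prob(n, p_up)
u_lstd = lstd_hit_prob(n, p_up, m, rng)[0]                 # estimate at state 1
u_mc = mc_hit_prob(n, p_up, budget=m * (n - 1), rng=rng)   # same transition budget
print(f"exact {u_true:.3e}  LSTD {u_lstd:.3e}  MC {u_mc:.3e}")
```

Both estimators see the same number of transitions. The MC estimate depends on the handful of runs that happen to reach state n, whereas LSTD combines accurate estimates of each one-step transition probability, which is the intuition behind the efficiency gap the abstract describes.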

Videos

The surprising efficiency of temporal difference learning for rare event prediction (SlidesLive)
