A Finite-Iteration Theory for Asynchronous Categorical Distributional Temporal-Difference Learning

Ege C. Kaya; Abolfazl Hashemi

arXiv:2605.06866 · cs.LG · May 11, 2026

TL;DR

This paper develops a finite-iteration theoretical framework for asynchronous categorical distributional temporal-difference learning, bridging the gap between theory and practical implementations.

Contribution

It introduces a finite-iteration analysis for asynchronous categorical TD methods, applicable to both scalar and multivariate settings, under various sampling regimes.

Findings

01 Provides finite-iteration guarantees for asynchronous categorical TD algorithms.

02 Establishes contraction properties in a statewise supremum norm after isometric embeddings.

03 Applies to both discounted and undiscounted fixed-horizon problems under different sampling assumptions.
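
For the scalar case, one way to read the second finding is sketched below. It assumes the metric is the Cramér distance named in the abstract; the symbols (F for a CDF, eta for a statewise collection of return distributions, Pi_C for the categorical projection, T^pi for the distributional Bellman operator) are notation chosen here for illustration, not taken from the paper. The Cramér distance between two distributions equals the L2 distance between their CDFs, so mapping a distribution to its CDF is an isometric embedding, and the supremum is taken over states. In the discounted scalar setting the projected categorical operator is known from earlier analyses to contract with modulus sqrt(gamma) in this metric; the constants used in this paper are not stated in the summary.

```latex
\[
\ell_2(\mu,\nu)
  = \left( \int_{\mathbb{R}} \bigl(F_\mu(x) - F_\nu(x)\bigr)^2 \, dx \right)^{1/2}
  = \lVert F_\mu - F_\nu \rVert_{L^2(\mathbb{R})},
\qquad
\bar{\ell}_2(\eta,\eta') = \sup_{s} \, \ell_2\bigl(\eta(s), \eta'(s)\bigr),
\]
\[
\bar{\ell}_2\bigl(\Pi_C T^{\pi}\eta,\; \Pi_C T^{\pi}\eta'\bigr)
  \le \sqrt{\gamma}\; \bar{\ell}_2(\eta,\eta').
\]
```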

Abstract

Recent non-asymptotic analyses have substantially advanced the theory of distributional policy evaluation, but they largely concern synchronous full-state updates under a generative model, model-based estimators, accelerated variants, or different approximation architectures. Standard categorical temporal-difference learning is typically used in a different regime. It asynchronously performs a single-state update at each iteration and, in online settings, is driven by a Markovian trajectory. This leaves an important gap between existing finite-iteration theory and the categorical recursions most closely aligned with practical distributional temporal-difference implementations. We bridge this gap for two categorical policy-evaluation methods: scalar categorical temporal-difference learning in the Cramér geometry and multivariate signed-categorical temporal-difference learning in the…
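
To make the asynchronous single-state update concrete, here is a minimal Python sketch. It assumes the standard scalar categorical TD rule: a mixture update toward a sample target that is projected onto a fixed, evenly spaced support by linear interpolation. The two-state chain, rewards, step size, and the function names `project_onto_support` and `categorical_td_step` are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def project_onto_support(atoms, probs, support):
    """Project sum_j probs[j] * delta(atoms[j]) onto an evenly spaced support.

    Each atom is clipped to the support range and its mass is split between
    the two neighbouring support points by linear interpolation.
    """
    z_min, z_max = support[0], support[-1]
    delta = support[1] - support[0]
    b = (np.clip(atoms, z_min, z_max) - z_min) / delta  # fractional index
    lower = np.floor(b).astype(int)
    upper = np.minimum(lower + 1, len(support) - 1)
    w_upper = b - lower
    out = np.zeros(len(support))
    np.add.at(out, lower, probs * (1.0 - w_upper))
    np.add.at(out, upper, probs * w_upper)
    return out

def categorical_td_step(p, s, r, s_next, support, gamma, alpha):
    """Asynchronous update: only row s of the probability table p changes."""
    target = project_onto_support(r + gamma * support, p[s_next], support)
    p[s] = (1.0 - alpha) * p[s] + alpha * target
    return p

# Hypothetical two-state Markov reward process under a fixed policy.
rng = np.random.default_rng(0)
n_states, m, gamma = 2, 11, 0.9
support = np.linspace(0.0, 10.0, m)        # returns lie in [0, 1 / (1 - gamma)]
P = np.array([[0.9, 0.1], [0.2, 0.8]])     # assumed transition matrix
rewards = np.array([0.0, 1.0])             # assumed reward per state

p = np.full((n_states, m), 1.0 / m)        # uniform initial estimates
s = 0
for _ in range(5000):                      # one Markovian trajectory
    s_next = rng.choice(n_states, p=P[s])
    categorical_td_step(p, s, rewards[s], s_next, support, gamma, alpha=0.05)
    s = s_next

print(p @ support)                         # mean of each estimated distribution
```

Each iteration touches only the row of the visited state, and the next state comes from the trajectory itself rather than from a generative model. This is the regime the abstract contrasts with synchronous full-state updates.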
