Quotient-Categorical Representations for Bellman-Compatible Average-Reward Distributional Reinforcement Learning

Ege C. Kaya; Aliasghar Pourghani; Vijay Gupta; Abolfazl Hashemi

arXiv:2605.11289·cs.LG·May 13, 2026

TL;DR

This paper introduces a quotient-space formulation for average-reward distributional reinforcement learning, addressing the ill-posedness of direct distributional analogues on the real line and establishing convergence properties.

Contribution

It proposes a quotient-categorical approach that respects symmetry in average-reward RL, with fixed points and convergence guarantees for the associated operators and recursions.

Findings

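01. The projected average-reward distributional operator on the quotient-categorical space is well-defined, non-expansive in a coordinate Cramér metric, and admits fixed points.

02. In an idealized centered-reward setting, a sampled one-state temporal-difference update converges almost surely, with finite-iteration residual bounds, under both i.i.d. and Markovian sampling.

03. Gain estimation can be integrated with convergence guarantees.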

Abstract

Average-reward reinforcement learning requires estimating the gain and the bias, which is defined only up to an additive constant. This makes direct distributional analogues ill-posed on the real line. We introduce a quotient-space formulation in which state-indexed bias laws are identified up to a common translation, together with a categorical parameterization that respects this symmetry. On this quotient-categorical space, we define a projected average-reward distributional operator and show that it is well-defined, non-expansive in a coordinate Cramér metric, and admits fixed points. We then study sampled recursions whose mean-field maps are asynchronous relaxations of this operator. In an idealized centered-reward setting, a one-state temporal-difference update enjoys almost sure convergence together with finite-iteration residual bounds under both i.i.d. and Markovian sampling.…
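
The translation symmetry that motivates the quotient construction can be seen already in the classical (non-distributional) setting. The sketch below is not taken from the paper: it uses a hypothetical two-state Markov reward process (the matrix P and reward vector r are made up for illustration) and checks numerically that if a bias vector h satisfies the average-reward evaluation equation g + h = r + P h, then so does h + c for any constant c.

```python
import numpy as np

# Hypothetical two-state Markov reward process (illustrative values only).
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])   # row-stochastic transition matrix
r = np.array([1.0, 0.0])     # expected reward in each state

# Stationary distribution: solve mu P = mu together with sum(mu) = 1.
A = np.vstack([P.T - np.eye(2), np.ones(2)])
b = np.array([0.0, 0.0, 1.0])
mu, *_ = np.linalg.lstsq(A, b, rcond=None)

# Gain: long-run average reward under the stationary distribution.
gain = float(mu @ r)

# Bias: solve (I - P) h = r - gain. The system is singular, so one
# representative is pinned by the extra constraint h[0] = 0.
M = np.vstack([np.eye(2) - P, np.array([1.0, 0.0])])
rhs = np.append(r - gain, 0.0)
h, *_ = np.linalg.lstsq(M, rhs, rcond=None)

def residual(bias):
    """Largest violation of gain + bias = r + P @ bias over all states."""
    return float(np.max(np.abs(gain + bias - (r + P @ bias))))

# Both the pinned solution and a shifted copy satisfy the equation,
# because every row of P sums to one.
shifted = h + 3.0
print(residual(h), residual(shifted))
```

Because the constraint h[0] = 0 is an arbitrary choice of representative, any method that tracks a full distribution over bias values has to decide how to handle this freedom; identifying laws up to a common translation, as the abstract describes, is one way to do so.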
