Universal Reinforcement Learning in Coalgebras: Asynchronous Stochastic Computation via Coinduction
Sridhar Mahadevan

TL;DR
This paper generalizes reinforcement learning using category theory and universal coalgebras, yielding a framework for asynchronous distributed computation that covers a broader class of dynamical system models.
Contribution
It introduces a categorial framework for universal reinforcement learning, connecting coinduction, coalgebras, and topos theory to model asynchronous distributed algorithms.
Findings
Models the space of algorithms for MDPs or PSRs as a functor category whose codomain category forms a topos.
Relates metric coinduction to asynchronous convergence.
Extends RL models to universal coalgebras for dynamic systems.
Abstract
In this paper, we introduce a categorial generalization of RL, termed universal reinforcement learning (URL), building on powerful mathematical abstractions from the study of coinduction on non-well-founded sets and universal coalgebras, topos theory, and categorial models of asynchronous parallel distributed computation. In the first half of the paper, we review the basic RL framework, illustrate the use of categories and functors in RL, showing how they lead to interesting insights. In particular, we also introduce a standard model of asynchronous distributed minimization proposed by Bertsekas and Tsitsiklis, and describe the relationship between metric coinduction and their proof of the Asynchronous Convergence Theorem. The space of algorithms for MDPs or PSRs can be modeled as a functor category, where the co-domain category forms a topos, which admits all (co)limits, possesses a…
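The idea behind asynchronous convergence mentioned in the abstract can be illustrated with a small sketch: when the update operator is a contraction, updating one component at a time in an arbitrary order reaches the same fixed point as full synchronous sweeps. The code below is a minimal illustration under stated assumptions, not the paper's construction. The 3-state MDP, the names `P`, `R`, `GAMMA`, `backup`, `synchronous_vi`, and `asynchronous_vi` are all hypothetical, and the asynchrony is simulated sequentially with a random update order and no communication delays, which is simpler than the Bertsekas and Tsitsiklis model.

```python
import random

# Hypothetical 3-state, 2-action MDP, invented purely for illustration.
# P[s][a] is a list of (next_state, probability); R[s][a] is the reward.
P = {
    0: {0: [(0, 0.5), (1, 0.5)], 1: [(2, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(1, 0.3), (2, 0.7)]},
    2: {0: [(2, 1.0)], 1: [(0, 0.6), (1, 0.4)]},
}
R = {0: {0: 1.0, 1: 0.0}, 1: {0: 0.5, 1: 2.0}, 2: {0: 0.0, 1: 1.5}}
GAMMA = 0.9  # discount factor; makes the Bellman backup a contraction
STATES = sorted(P)


def backup(V, s):
    """Bellman optimality backup for a single state s."""
    return max(
        R[s][a] + GAMMA * sum(p * V[t] for t, p in P[s][a])
        for a in P[s]
    )


def synchronous_vi(sweeps=2000):
    """Update every state from the same previous iterate."""
    V = [0.0 for _ in STATES]
    for _ in range(sweeps):
        V = [backup(V, s) for s in STATES]
    return V


def asynchronous_vi(updates=6000, seed=0):
    """Update one randomly chosen state at a time, in place.

    Assumption: random selection visits every state infinitely often
    in the limit; delays between processors are not modeled.
    """
    rng = random.Random(seed)
    V = [0.0 for _ in STATES]
    for _ in range(updates):
        s = rng.randrange(len(STATES))
        V[s] = backup(V, s)
    return V


if __name__ == "__main__":
    v_sync = synchronous_vi()
    v_async = asynchronous_vi()
    gap = max(abs(a - b) for a, b in zip(v_sync, v_async))
    print("max difference between schedules:", gap)
```

Both schedules apply the same contraction, so the two value vectors should agree up to floating-point error; only the order in which components are refreshed differs.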
Taxonomy
Topics: Logic, programming, and type systems · Constraint Satisfaction and Optimization · Complexity and Algorithms in Graphs
