Linking PageRank, Time Reversal, and Policy Evaluation

Konstantin Avrachenkov; Lorenzo Gregoris; Nelly Litvak

arXiv:2605.00532·math.OC·May 4, 2026

TL;DR

This paper links policy evaluation in Markov decision processes to PageRank: for a fixed policy, the value function of a discounted MDP can be obtained, up to an explicit rescaling, from the PageRank vector of a time-reversed Markov chain.

Contribution

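It establishes a connection between MDP policy evaluation and PageRank, extends it to undiscounted MDPs with terminal states and to transition-dependent rewards, and proves a decomposition theorem that reduces policy evaluation for arbitrary finite MDPs to a collection of PageRank problems.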

Findings

01

Policy evaluation reduces to PageRank problems on the recurrent and transient components of the policy-induced Markov chain.

02

The approach extends to undiscounted MDPs with terminal states.

03

Numerical examples demonstrate efficiency on large graphs.

Abstract

We establish a connection between policy evaluation in Markov decision processes and PageRank in network analysis. For a fixed policy, we show that the value function of a discounted Markov decision process can be obtained, up to an explicit rescaling, from the PageRank vector of a suitably defined time-reversed Markov chain. In this correspondence, the discount factor plays the role of the teleportation parameter, while rewards induce the restart distribution. Beyond the irreducible case, invoking quasi-stationary distributions and Doob $h$-transforms, we prove a general decomposition theorem showing that policy evaluation for arbitrary finite MDPs reduces to a collection of PageRank problems on the recurrent and transient components of the policy-induced Markov chain. This framework naturally extends to undiscounted MDPs with terminal states and to transition-dependent rewards. We…
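The correspondence described in the abstract can be illustrated numerically in the irreducible case. The following is a minimal sketch, not the paper's own formulation: it assumes an irreducible policy-induced chain `P`, nonnegative rewards `r` with at least one positive entry, the standard time reversal `P_rev[i, j] = pi[j] * P[j, i] / pi[i]`, and a restart distribution proportional to `pi * r`. Under those assumptions the rescaling `v = Z * rho / ((1 - gamma) * pi)`, with `Z = pi @ r`, follows from the linear algebra; the function names and the example chain are hypothetical.

```python
import numpy as np

def stationary_distribution(P):
    """Stationary distribution of an irreducible stochastic matrix P (pi P = pi)."""
    n = P.shape[0]
    # Stack the balance equations with the normalisation sum(pi) = 1.
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def value_via_pagerank(P, r, gamma):
    """Recover the discounted value function from a PageRank vector.

    Assumptions (illustrative, not taken from the paper): P is irreducible,
    r >= 0 with r != 0, and 0 <= gamma < 1.
    """
    n = P.shape[0]
    pi = stationary_distribution(P)
    # Time-reversed chain: P_rev[i, j] = pi[j] * P[j, i] / pi[i].
    P_rev = (P.T * pi[None, :]) / pi[:, None]
    # Rewards induce the restart distribution s, normalised by Z.
    Z = pi @ r
    s = pi * r / Z
    # PageRank fixed point: rho = (1 - gamma) * s + gamma * rho @ P_rev.
    rho = np.linalg.solve((np.eye(n) - gamma * P_rev).T, (1 - gamma) * s)
    # Explicit rescaling back to the value function.
    v = Z * rho / ((1 - gamma) * pi)
    return v, rho

# Hypothetical 3-state chain under a fixed policy.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])
r = np.array([1.0, 0.0, 2.0])
gamma = 0.9

v_pr, rho = value_via_pagerank(P, r, gamma)
# Direct policy evaluation: v = (I - gamma P)^{-1} r.
v_direct = np.linalg.solve(np.eye(3) - gamma * P, r)
print(np.allclose(v_pr, v_direct))
```

Here `gamma` acts as the damping factor, so the walk on the reversed chain restarts with probability `1 - gamma`. Rewards that can be negative, reducible chains, and terminal states fall outside this sketch; the abstract indicates the paper handles the non-irreducible cases through quasi-stationary distributions and Doob $h$-transforms.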
