A Pseudo-Gradient Approach for Model-free Markov Chain Optimization

Nanne A. Dieleman; Joost Berkhout; Bernd Heidergott

arXiv:2407.14786·math.OC·July 23, 2024·1 cites

TL;DR

This paper introduces SM-SPSA, a model-free pseudo-gradient method for optimizing functions over the stationary distribution of Markov chains. It reports better scaling than a benchmark commercial solver on large problems and applies the method to web-page ranking on real web-graph data.

Contribution

The paper develops a novel stochastic matrix SPSA algorithm for Markov chain optimization that handles hard constraints via transformations and introduces heuristics to improve convergence.

Findings

01. SM-SPSA scales better than a benchmark commercial solver on large problems

02. The method effectively maximizes web-page rankings in real web-graph data

03. Heuristics reduce infliction points, enhancing convergence

Abstract

We develop a first-order (pseudo-)gradient approach for optimizing functions over the stationary distribution of discrete-time Markov chains (DTMC). We give insights into why solving this optimization problem is challenging and show how transformations can be used to circumvent the hard constraints inherent in the optimization problem. The optimization framework is model-free since no explicit model of the interdependence of the row elements of the Markov chain transition matrix is required. Upon the transformation, we build an extension of the Simultaneous Perturbation Stochastic Approximation (SPSA) algorithm, called stochastic matrix SPSA (SM-SPSA), to solve the optimization problem. The performance of the SM-SPSA gradient search is compared with a benchmark commercial solver. Numerical examples show that SM-SPSA scales better, which makes it the preferred solution method for large problem…
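The idea in the abstract, transforming away the row-stochastic constraints and then running SPSA on the unconstrained parameters, can be illustrated with a minimal sketch. This is not the authors' SM-SPSA implementation. The row-wise softmax is an assumed stand-in for the paper's transformation, the gain sequences are the commonly used SPSA defaults, and the objective (the stationary probability of state 0) and all function names are hypothetical choices made for illustration.

```python
import numpy as np


def row_softmax(theta):
    """Map an unconstrained matrix to a row-stochastic one (assumed transformation)."""
    z = theta - theta.max(axis=1, keepdims=True)  # shift rows for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def stationary_distribution(P):
    """Solve pi P = pi with sum(pi) = 1 by least squares (irreducible chain assumed)."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi


def spsa_maximize(objective, theta0, n_iter=500, a=0.5, c=0.1, seed=0):
    """Plain SPSA ascent on the unconstrained parameters theta."""
    rng = np.random.default_rng(seed)
    theta = theta0.copy()
    for k in range(1, n_iter + 1):
        a_k = a / k ** 0.602  # step-size gain
        c_k = c / k ** 0.101  # perturbation size
        delta = rng.choice([-1.0, 1.0], size=theta.shape)  # Rademacher perturbation
        j_plus = objective(row_softmax(theta + c_k * delta))
        j_minus = objective(row_softmax(theta - c_k * delta))
        g_hat = (j_plus - j_minus) / (2.0 * c_k * delta)  # pseudo-gradient estimate
        theta += a_k * g_hat  # ascent step, since we maximize
    return row_softmax(theta)


def objective(P):
    """Hypothetical objective: stationary probability of state 0."""
    return stationary_distribution(P)[0]


theta0 = np.zeros((3, 3))  # softmax of zeros is the uniform transition matrix
P_opt = spsa_maximize(objective, theta0)
print(np.round(P_opt, 3))
print(objective(P_opt))
```

Because the search runs over `theta` rather than over the transition matrix itself, every iterate is row-stochastic by construction and no projection step is needed. Each iteration uses two objective evaluations regardless of the matrix size.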

Code & Models

Repositories

nanned/SM-SPSA (Official)

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Markov Chains and Monte Carlo Methods · Reinforcement Learning in Robotics