On the function approximation error for risk-sensitive reinforcement learning

Prasenjit Karmakar; Shalabh Bhatnagar

arXiv:1612.07562·cs.LG·October 23, 2019

TL;DR

This paper derives new function approximation error bounds for risk-sensitive policy evaluation in reinforcement learning. The bounds exploit the irreducibility of the Markov chain and Perron-Frobenius theory, and improve on the earlier spectral variation bound of Basu et al.

Contribution

It introduces novel bounds based on irreducibility and Perron-Frobenius eigenvectors, offering tighter error estimates than prior spectral variation bounds.

Findings

01. Bounds are tight and match actual errors in examples
02. New bounds outperform previous spectral bounds in large state spaces
03. Provides eigenvalue comparison bounds for irreducible matrices

Abstract

In this paper we obtain several informative error bounds on function approximation for the policy evaluation algorithm proposed by Basu et al. when the aim is to find the risk-sensitive cost represented using exponential utility. The main idea is to use the classical Bapat's inequality together with Perron-Frobenius eigenvectors (which exist if we assume an irreducible Markov chain) to get the new bounds. The novelty of our approach is that we use the irreducibility of the Markov chain to get the new bounds, whereas the earlier work by Basu et al. used a spectral variation bound, which is true for any matrix. We also give examples where all our bounds achieve the "actual error" whereas the earlier bound given by Basu et al. is much weaker in comparison. We show that this happens due to the absence of a difference term in the earlier bound, which is always present in all our bounds when the state space is large.…
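To make the role of the Perron-Frobenius eigenvector concrete, here is a minimal sketch that is not taken from the paper. It assumes the standard exponential-utility formulation, in which the risk-sensitive cost of a finite irreducible chain is the logarithm of the Perron-Frobenius eigenvalue of the matrix with entries exp(c(i, j)) p(j | i). The transition matrix `P`, the cost table `c`, and the helper names are hypothetical, and plain power iteration is used here only for illustration (it assumes strictly positive entries so that the iteration converges); it is not the function approximation scheme of Basu et al.

```python
import math

def risk_sensitive_matrix(P, c):
    # Entry (i, j) is exp(c[i][j]) * P[i][j]; nonnegative, and irreducible
    # whenever P is irreducible (assumed formulation, see lead-in).
    n = len(P)
    return [[math.exp(c[i][j]) * P[i][j] for j in range(n)] for i in range(n)]

def perron_pair(A, iters=10000, tol=1e-13):
    # Power iteration for the dominant eigenvalue and a positive right
    # eigenvector of A, normalised so that its entries sum to 1.
    n = len(A)
    v = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(w)  # v sums to 1, so sum(A v) estimates the eigenvalue
        w = [x / lam for x in w]
        converged = max(abs(a - b) for a, b in zip(w, v)) < tol
        v = w
        if converged:
            break
    return lam, v

# Hypothetical two-state chain and one-step costs, for illustration only.
P = [[0.9, 0.1], [0.2, 0.8]]
c = [[1.0, 0.5], [0.0, 2.0]]

A = risk_sensitive_matrix(P, c)
lam, v = perron_pair(A)
risk_sensitive_cost = math.log(lam)
```

Under these assumptions, `v` has strictly positive entries, which is the property that irreducibility guarantees, and `lam` lies between the smallest and largest row sums of `A`. With all costs set to zero, `A` reduces to `P`, the eigenvalue is 1, and the risk-sensitive cost is 0.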

Taxonomy

Topics: Low-power high-performance VLSI design · Reinforcement Learning in Robotics · Machine Learning and Algorithms