On the policy improvement algorithm for ergodic risk-sensitive control
Ari Arapostathis, Anup Biswas, and Somnath Pradhan

TL;DR
This paper proves convergence of the policy improvement algorithm for ergodic risk-sensitive control of multidimensional diffusions, under either a blanket stability hypothesis or a near-monotone running cost, and gives a more general result on the region of attraction of the algorithm's equilibrium.
Contribution
It establishes convergence of the policy improvement algorithm for a broad class of ergodic risk-sensitive control models, covering both the blanket-stability and the near-monotone running cost settings.
Findings
Convergence of the policy improvement algorithm is proven for both the minimization and the maximization problems.
A more general result is presented concerning the region of attraction of the algorithm's equilibrium.
Results apply to multidimensional controlled diffusions on the whole space.
Abstract
In this article we consider the ergodic risk-sensitive control problem for a large class of multidimensional controlled diffusions on the whole space. We study the minimization and maximization problems under either a blanket stability hypothesis, or a near-monotone assumption on the running cost. We establish the convergence of the policy improvement algorithm for these models. We also present a more general result concerning the region of attraction of the equilibrium of the algorithm.
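To make the idea of policy improvement for a risk-sensitive ergodic cost concrete, here is a minimal sketch on a finite-state, discrete-time analogue. This is not the paper's setting (which concerns controlled diffusions on the whole space) and not the authors' construction. It assumes a small Markov decision process with strictly positive transition probabilities, where the risk-sensitive cost of a stationary policy is the logarithm of the principal eigenvalue of the cost-twisted kernel. All names (`perron`, `twisted_kernel`, `policy_improvement`, `P`, `c`) and the numerical data are hypothetical and chosen only for illustration.

```python
import math

def perron(Q, iters=10_000, tol=1e-13):
    """Power iteration for the principal eigenpair of a positive matrix Q."""
    n = len(Q)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(Q[i][j] * v[j] for j in range(n)) for i in range(n)]
        new_lam = max(w)                    # v is normalized so that max(v) == 1
        w = [x / new_lam for x in w]
        if abs(new_lam - lam) < tol and max(abs(a - b) for a, b in zip(w, v)) < tol:
            return new_lam, w
        lam, v = new_lam, w
    return lam, v

def twisted_kernel(P, c, policy):
    """Q(x, y) = exp(c(x, u)) * P(y | x, u) with u = policy[x]."""
    n = len(policy)
    return [[math.exp(c[x][policy[x]]) * P[policy[x]][x][y] for y in range(n)]
            for x in range(n)]

def policy_improvement(P, c, policy, max_rounds=100):
    """Alternate eigenpair evaluation and pointwise minimization (assumed scheme)."""
    n, n_actions = len(policy), len(P)
    history = []
    for _ in range(max_rounds):
        # Evaluation: principal eigenpair of the kernel under the current policy.
        lam, V = perron(twisted_kernel(P, c, policy))
        history.append(math.log(lam))       # risk-sensitive ergodic cost
        # Improvement: switch an action only when it is strictly better.
        new_policy = list(policy)
        for x in range(n):
            def q(u, x=x):
                return math.exp(c[x][u]) * sum(P[u][x][y] * V[y] for y in range(n))
            best = min(range(n_actions), key=q)
            if q(best) < q(policy[x]) - 1e-12:
                new_policy[x] = best
        if new_policy == policy:
            return policy, history
        policy = new_policy
    return policy, history

# Hypothetical data: P[u][x][y] is a transition probability, c[x][u] a running cost.
P = [
    [[0.9, 0.1], [0.5, 0.5]],   # action 0
    [[0.2, 0.8], [0.1, 0.9]],   # action 1
]
c = [[1.0, 1.5], [0.2, 0.6]]

policy, history = policy_improvement(P, c, [0, 0])
print("final policy:", policy)
print("cost per round:", history)
```

In this finite analogue the recorded costs are non-increasing from round to round, and the loop stops once no state can strictly improve its action. The diffusion case treated in the paper requires the stability or near-monotonicity hypotheses named in the abstract, which have no counterpart in this toy example.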
