Variance Adjusted Actor Critic Algorithms

Aviv Tamar; Shie Mannor

arXiv:1310.3697 · stat.ML · October 15, 2013 · 25 cites

Open Access

TL;DR

This paper introduces an actor-critic framework for Markov decision processes (MDPs) that optimizes a variance-adjusted expected return. The critic uses linear function approximation, and the episodic algorithm is shown to converge almost surely to a locally optimal point of that objective.

Contribution

The paper extends the concept of compatible features to variance-adjusted objectives and provides an episodic actor-critic algorithm with an almost-sure convergence guarantee.

Findings

01. The episodic algorithm converges almost surely to a locally optimal point of the objective

02. Extends compatible features to the variance-adjusted setting

03. Demonstrates effectiveness of the proposed algorithm

Abstract

We present an actor-critic framework for MDPs where the objective is the variance-adjusted expected return. Our critic uses linear function approximation, and we extend the concept of compatible features to the variance-adjusted setting. We present an episodic actor-critic algorithm and show that it converges almost surely to a locally optimal point of the objective function.
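
To make the objective concrete, the sketch below estimates a variance-adjusted return from sampled episode returns. The abstract does not spell out the formula, so this assumes the common form J - mu * Var[B], where B is the total return of an episode, J = E[B], and mu >= 0 is a risk-penalty weight. The function name variance_adjusted_return and the parameter mu are illustrative and not taken from the paper, and this is a plain Monte Carlo estimate, not the paper's actor-critic update.

```python
from statistics import fmean


def variance_adjusted_return(returns, mu):
    """Monte Carlo estimate of J - mu * Var[B] from sampled episode returns.

    Assumed objective form (not stated in the abstract): expected return
    minus mu times the variance of the return, with mu >= 0.
    """
    j_hat = fmean(returns)                        # estimate of J = E[B]
    m_hat = fmean([b * b for b in returns])       # estimate of the second moment E[B^2]
    var_hat = m_hat - j_hat ** 2                  # Var[B] = E[B^2] - (E[B])^2
    return j_hat - mu * var_hat


# Hypothetical returns from two policies: one steady, one high-mean but volatile.
safe_returns = [1.0, 1.0, 1.0, 1.0]
risky_returns = [0.0, 4.0, 0.0, 4.0]

for mu in (0.0, 0.5):
    safe = variance_adjusted_return(safe_returns, mu)
    risky = variance_adjusted_return(risky_returns, mu)
    print(f"mu={mu}: safe={safe}, risky={risky}")
```

With mu = 0 the estimate reduces to the plain expected return and the volatile policy scores higher. With mu = 0.5 its variance penalty outweighs its higher mean and the steady policy scores higher, which is the trade-off a variance-adjusted objective is meant to capture.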

Taxonomy

Topics: Reinforcement Learning in Robotics · Auction Theory and Applications · Advanced Bandit Algorithms Research