Gamma-Nets: Generalizing Value Estimation over Timescale

Craig Sherstan; Shibhansh Dohare; James MacGlashan; Johannes Günther; Patrick M. Pilarski

arXiv:1911.07794·cs.LG·October 20, 2020

TL;DR

Gamma-Nets feed the timescale to the value estimator as an input, so a single estimator can predict at arbitrary timescales without prior task knowledge. The approach is demonstrated for policy evaluation on a square wave, a robot arm, and several Atari games.

Contribution

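Introduces Gamma-Nets, a method for generalizing value estimation over timescale by taking the timescale as one of the estimator's inputs. The prediction target for any timescale is then available, so the estimator can be trained on multiple timescales at each timestep.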

Findings

01

Effective in policy evaluation on square wave and robot arm tasks.

02

Predicts arbitrary timescales with only a small cost in accuracy compared to estimators learned for fixed timescales.

03

Applicable to deep RL environments like Atari games.

Abstract

We present $\Gamma$-nets, a method for generalizing value function estimation over timescale. By using the timescale as one of the estimator's inputs we can estimate value for arbitrary timescales. As a result, the prediction target for any timescale is available and we are free to train on multiple timescales at each timestep. Here we empirically evaluate $\Gamma$-nets in the policy evaluation setting. We first demonstrate the approach on a square wave and then on a robot arm using linear function approximation. Next, we consider the deep reinforcement learning setting using several Atari video games. Our results show that $\Gamma$-nets can be effective for predicting arbitrary timescales, with only a small cost in accuracy as compared to learning estimators for fixed timescales. $\Gamma$-nets provide a method for compactly making predictions at many timescales without requiring a…
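
The idea of passing the timescale to the estimator can be illustrated with a minimal sketch. It is not the authors' architecture or training procedure: it assumes a toy square-wave cumulant, a linear estimator, a polynomial encoding of the discount factor, and plain TD(0). All names (`timescale_code`, `train`, `N_PHASES`, `GAMMA_POWERS`) and all constants are hypothetical choices made for illustration. What it shows is that one transition supplies a bootstrapped target for every discount, so several timescales can be updated at each timestep.

```python
import numpy as np

N_PHASES = 8       # length of one square-wave period (assumed for illustration)
GAMMA_POWERS = 4   # discount encoded as [1, g, g^2, g^3] (assumed encoding)


def cumulant(phase):
    """Square wave: 1 for the first half of the period, 0 for the second."""
    return 1.0 if phase < N_PHASES // 2 else 0.0


def timescale_code(gamma):
    """Polynomial encoding of the discount that is fed to the estimator."""
    return gamma ** np.arange(GAMMA_POWERS)


def predict(w, phase, gamma):
    """Linear estimate of the value at `phase` for discount `gamma`."""
    return float(w[phase] @ timescale_code(gamma))


def true_value(phase, gamma, horizon=400):
    """Discounted sum of future cumulants, truncated at `horizon` steps."""
    return sum(gamma ** k * cumulant((phase + k + 1) % N_PHASES)
               for k in range(horizon))


def train(steps=30000, alpha=0.05, train_gammas=(0.0, 0.3, 0.6, 0.9)):
    """TD(0) in which every transition updates all training timescales."""
    w = np.zeros((N_PHASES, GAMMA_POWERS))
    codes = {g: timescale_code(g) for g in train_gammas}
    phase = 0
    for _ in range(steps):
        next_phase = (phase + 1) % N_PHASES
        c = cumulant(next_phase)
        # The same transition yields a TD target for each discount.
        for gamma, code in codes.items():
            target = c + gamma * (w[next_phase] @ code)
            delta = target - w[phase] @ code
            w[phase] += alpha * delta * code
        phase = next_phase
    return w


def mean_abs_error(w, gammas):
    """Mean absolute error against the truncated true value."""
    errors = [abs(predict(w, p, g) - true_value(p, g))
              for p in range(N_PHASES) for g in gammas]
    return float(np.mean(errors))
```

After `w = train()`, a call such as `predict(w, 0, 0.45)` queries a discount that was never used as a training target. How well that query matches `true_value(0, 0.45)` depends on the assumed encoding and training discounts, not on anything reported in the paper.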
