# GQ($\lambda$) Quick Reference and Implementation Guide

**Authors:** Adam White, Richard S. Sutton

arXiv: 1705.03967 · 2017-05-12

## TL;DR

This paper is a quick reference and implementation guide for linear GQ(λ), a gradient-based off-policy temporal-difference learning algorithm, with accompanying Java code. The intuition and theory behind the algorithm are covered elsewhere (e.g., Maei & Sutton 2010, Maei 2011).

## Contribution

The paper offers a concise reference for linear GQ(λ) together with Java source code, making this gradient-based off-policy learning algorithm easier to implement and apply.

## Key findings

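- Provides an implementation guide for linear GQ(λ)
- Includes Java code, distributed with the source files of the arXiv submission
- Defers the intuition and theory behind the algorithm to prior work (e.g., Maei & Sutton 2010, Maei 2011)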

## Abstract

This document should serve as a quick reference for and guide to the implementation of linear GQ($\lambda$), a gradient-based off-policy temporal-difference learning algorithm. Explanations of the intuition and theory behind the algorithm are provided elsewhere (e.g., Maei & Sutton 2010, Maei 2011). If you have questions or concerns about the content in this document or the attached Java code, please email Adam White (adam.white@ualberta.ca). The code is provided as part of the source files in the arXiv submission.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.03967/full.md

---
Source: https://tomesphere.com/paper/1705.03967