Analytically Tractable Bayesian Deep Q-Learning

Luong Ha Nguyen, James-A. Goulet

arXiv:2106.11086 · cs.LG · June 22, 2021

TL;DR

This paper introduces an analytically tractable Bayesian deep Q-learning method that enables closed-form inference for neural network weights, achieving competitive performance without gradient-based optimization.

Contribution

The paper adapts the temporal difference Q-learning framework to work with tractable approximate Gaussian inference (TAGI), so that network parameters are inferred analytically, reducing reliance on hyperparameters and gradient-based methods.

Findings

01. TAGI reaches performance comparable to networks trained with gradient-based optimization.

02. Training requires fewer hyperparameters.

03. The approach scales to complex environments such as Atari games.

Abstract

Reinforcement learning (RL) has attracted growing interest since deep Q-learning (DQN) was shown to reach human-level performance on video game benchmarks. The current consensus for training neural networks in such complex environments is to rely on gradient-based optimization. Although alternative Bayesian deep learning methods exist, most of them still rely on gradient-based optimization, and they typically do not scale to benchmarks such as the Atari game environment. Moreover, none of these approaches allows analytical inference for the weights and biases that define the neural network. In this paper, we show how to adapt the temporal difference Q-learning framework to make it compatible with tractable approximate Gaussian inference (TAGI), which allows the parameters of a neural network to be learned using a closed-form analytical method.…
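To make the idea of closed-form learning toward a temporal difference target more concrete, here is a minimal sketch in plain Python. It is not TAGI and not the authors' implementation: it assumes a Q-function that is linear in hand-chosen features, with a Gaussian belief over the weights, so a single Gaussian conditioning step replaces a gradient step. All names (`td_target`, `gaussian_td_update`, `obs_var`, the feature vectors) are hypothetical and chosen only for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def td_target(reward, gamma, next_action_features, mu, done):
    """TD target r + gamma * max_a' Q(s', a'), using the mean weights mu.

    next_action_features holds one feature vector per action in s'.
    """
    if done:
        return reward
    return reward + gamma * max(dot(phi, mu) for phi in next_action_features)


def gaussian_td_update(mu, sigma, phi, target, obs_var):
    """Closed-form Gaussian conditioning of weights theta ~ N(mu, sigma).

    Assumed toy model: target = phi^T theta + noise, noise ~ N(0, obs_var).
    Returns the posterior mean and covariance; no gradients are computed.
    """
    n = len(mu)
    pred_mean = dot(phi, mu)
    # Covariance between the weights and the predicted Q-value: sigma @ phi
    cov_theta_y = [dot(sigma[i], phi) for i in range(n)]
    # Variance of the predicted Q-value plus observation noise
    pred_var = dot(phi, cov_theta_y) + obs_var
    gain = [c / pred_var for c in cov_theta_y]
    mu_post = [mu[i] + gain[i] * (target - pred_mean) for i in range(n)]
    sigma_post = [
        [sigma[i][j] - gain[i] * cov_theta_y[j] for j in range(n)]
        for i in range(n)
    ]
    return mu_post, sigma_post


# Illustrative transition with made-up numbers (assumption, not paper data).
mu = [0.0, 0.0, 0.0]
sigma = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
phi = [1.0, 0.5, 0.0]                         # features of (s, a)
next_feats = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # features of (s', a') per action

y = td_target(1.0, 0.99, next_feats, mu, done=False)
mu_post, sigma_post = gaussian_td_update(mu, sigma, phi, y, obs_var=0.1)
q_after = dot(phi, mu_post)
```

After one update, the predicted Q-value for the visited state-action pair moves toward the target and the weight uncertainty shrinks. In the paper's setting the linear model is replaced by a neural network whose parameters are inferred with TAGI; this sketch only illustrates the general pattern of replacing a gradient step with an analytical Gaussian update.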

Taxonomy

Topics: Machine Learning and Data Classification · Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning

Methods: Q-Learning