Analytically Tractable Bayesian Deep Q-Learning
Luong Ha Nguyen, James-A. Goulet

TL;DR
This paper introduces an analytically tractable Bayesian deep Q-learning method that enables closed-form inference for neural network weights, achieving competitive performance without gradient-based optimization.
Contribution
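The paper adapts the temporal difference Q-learning framework to make it compatible with tractable approximate Gaussian inference (TAGI). The network's weights and biases are then inferred analytically, which reduces the number of hyperparameters and removes the reliance on gradient-based optimization.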
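As a rough illustration of how a temporal difference target can feed a closed-form Bayesian update, the sketch below keeps a Gaussian belief (mean and variance) over each Q-value in a small table and conditions it on the TD target with the standard Gaussian conjugate update. This is a deliberately simplified stand-in, not TAGI itself and not the authors' implementation: the paper infers neural network weights, whereas this example uses a lookup table. The names `GaussianQ`, `td_target`, `gaussian_update`, and the `noise_var` parameter are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class GaussianQ:
    """Gaussian belief over a single Q-value (illustrative, not from the paper)."""
    mean: float = 0.0
    var: float = 1.0


def td_target(reward, gamma, next_beliefs, done):
    """Temporal difference target: r + gamma * max over a' of E[Q(s', a')]."""
    if done or not next_beliefs:
        return reward
    return reward + gamma * max(b.mean for b in next_beliefs)


def gaussian_update(belief, target, noise_var):
    """Condition N(mean, var) on a noisy observation of the Q-value.

    Closed-form conjugate update; no gradient or learning rate is involved.
    noise_var is an assumed observation-noise variance.
    """
    gain = belief.var / (belief.var + noise_var)
    new_mean = belief.mean + gain * (target - belief.mean)
    new_var = (1.0 - gain) * belief.var
    return GaussianQ(new_mean, new_var)


# Toy two-state example with hypothetical state and action names.
q = {
    ("s0", "left"): GaussianQ(),
    ("s0", "right"): GaussianQ(),
    ("s1", "left"): GaussianQ(mean=2.0),
    ("s1", "right"): GaussianQ(mean=0.5),
}

# Observed transition: (s0, left) -> reward 1.0 -> s1, episode not finished.
y = td_target(
    reward=1.0,
    gamma=0.9,
    next_beliefs=[q[("s1", "left")], q[("s1", "right")]],
    done=False,
)
q[("s0", "left")] = gaussian_update(q[("s0", "left")], y, noise_var=1.0)
```

The update replaces the usual learning-rate step with a gain computed from the current variance and the assumed noise variance, so the belief's variance shrinks as more transitions are observed.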
Findings
Deep Q-learning with TAGI achieves performance comparable to networks trained with gradient-based optimization.
Training requires fewer hyperparameters than gradient-based methods.
The approach scales to complex environments such as Atari games.
Abstract
Reinforcement learning (RL) has gained increasing interest since it was demonstrated that deep Q-learning (DQN) could reach human performance on video game benchmarks. The current consensus for training neural networks on such complex environments is to rely on gradient-based optimization. Although alternative Bayesian deep learning methods exist, most of them still rely on gradient-based optimization, and they typically do not scale to benchmarks such as the Atari game environment. Moreover, none of these approaches allows analytical inference of the weights and biases defining the neural network. In this paper, we show how to adapt the temporal difference Q-learning framework to make it compatible with tractable approximate Gaussian inference (TAGI), which allows learning the parameters of a neural network using a closed-form analytical method.…
Taxonomy
Topics: Machine Learning and Data Classification · Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning
Methods: Q-Learning
