Generalised Linear Models in Deep Bayesian RL with Learnable Basis Functions

Jingyang You; Hanna Kurniawati

arXiv:2512.20974 · cs.LG · May 11, 2026

TL;DR

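GLiBRL (Generalised Linear Models in Deep Bayesian RL with Learnable Basis Functions) makes Bayesian inference over task parameters fully tractable by combining generalised linear models with learnable basis functions, improving task representations and performance on the MuJoCo and MetaWorld benchmarks.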

Contribution

The paper presents GLiBRL, a method that enables exact Bayesian inference over task parameters and model noise and establishes a structural link between task representations and kernel methods.

Findings

01. Achieves up to 1.8× performance improvement on MuJoCo and MetaWorld benchmarks.

02. Provides a closed-form relationship between task representation distance and kernel similarity.

03. Enables seamless integration with on-policy and off-policy RL algorithms.

Abstract

Bayesian Reinforcement Learning (BRL), a subclass of Meta-Reinforcement Learning (Meta-RL), provides a principled framework for generalisation by explicitly incorporating Bayesian task parameters into transition and reward models. However, classical BRL methods assume known forms of transition and reward models. While recent deep BRL methods incorporate model learning to address this, applying neural networks directly to joint data and task parameters necessitates variational inference. This often yields indistinct task representations, compromising the resulting BRL policies. To overcome these limitations, we introduce Generalised Linear Models in Deep Bayesian RL with Learnable Basis Functions (GLiBRL). Our approach features fully tractable Bayesian inference over task parameters and model noise, alongside exact marginal likelihood evaluation for learning transition and reward models.…
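To make "fully tractable Bayesian inference" and "exact marginal likelihood evaluation" concrete, here is a minimal sketch of conjugate Bayesian linear regression over basis-function features. It is not the authors' implementation and rests on several simplifying assumptions: the basis `basis(x)` is a fixed, hypothetical stand-in for a learned one, the output is scalar, the noise variance is treated as known (the abstract says GLiBRL also infers model noise), and the prior over weights is an isotropic Gaussian. All function and parameter names are illustrative.

```python
import math

def basis(x):
    """Hypothetical fixed basis; in the paper's setting the basis would be learned."""
    return [1.0, x, math.tanh(x)]

def bayes_update(mean, cov, phi, y, noise_var):
    """One exact Gaussian posterior update for weights w, given y = w . phi + noise.

    Returns the new mean, the new covariance, and the log predictive
    density of y under the posterior held before this update.
    """
    d = len(phi)
    cov_phi = [sum(cov[i][j] * phi[j] for j in range(d)) for i in range(d)]
    pred_mean = sum(mean[i] * phi[i] for i in range(d))
    pred_var = noise_var + sum(phi[i] * cov_phi[i] for i in range(d))
    resid = y - pred_mean
    log_pred = -0.5 * (math.log(2.0 * math.pi * pred_var) + resid ** 2 / pred_var)
    # Rank-one update of mean and covariance, so no matrix inverse is needed.
    new_mean = [mean[i] + cov_phi[i] * resid / pred_var for i in range(d)]
    new_cov = [[cov[i][j] - cov_phi[i] * cov_phi[j] / pred_var for j in range(d)]
               for i in range(d)]
    return new_mean, new_cov, log_pred

def fit(xs, ys, noise_var=0.01, prior_var=10.0):
    """Process the data one point at a time.

    The summed log predictive densities equal the exact log marginal
    likelihood of the data, by the chain rule of probability.
    """
    d = len(basis(xs[0]))
    mean = [0.0] * d
    cov = [[prior_var if i == j else 0.0 for j in range(d)] for i in range(d)]
    log_ml = 0.0
    for x, y in zip(xs, ys):
        mean, cov, log_pred = bayes_update(mean, cov, basis(x), y, noise_var)
        log_ml += log_pred
    return mean, cov, log_ml

# Toy data generated from known weights (illustrative only, noise-free).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
true_w = [0.5, -1.0, 2.0]
ys = [sum(w * f for w, f in zip(true_w, basis(x))) for x in xs]

mean, cov, log_ml = fit(xs, ys)
print("posterior mean:", [round(m, 3) for m in mean])
print("log marginal likelihood:", round(log_ml, 3))
```

Because the model is linear in the weights once the features are fixed, the posterior and the marginal likelihood are both available in closed form, so no variational approximation is needed. In a setting with learnable basis functions, a quantity like `log_ml` could plausibly serve as the training objective for the basis parameters, though the sketch above does not do this.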
