Structure learning with Temporal Gaussian Mixture for model-based Reinforcement Learning
Théophile Champion, Marek Grześ, Howard Bowman

TL;DR
This paper introduces a temporal Gaussian Mixture Model for model-based reinforcement learning that learns environment structure and transitions, enabling efficient decision making and maze navigation.
Contribution
The paper presents a novel temporal Gaussian Mixture Model with structure learning capabilities for perception and transition modeling in reinforcement learning.
Findings
Successfully learned maze structures and state transitions.
Discovered the number of states and transition probabilities.
Enabled effective navigation using learned Q-values.
Abstract
Model-based reinforcement learning refers to a set of approaches capable of sample-efficient decision making, which create an explicit model of the environment. This model can subsequently be used for learning optimal policies. In this paper, we propose a temporal Gaussian Mixture Model composed of a perception model and a transition model. The perception model extracts discrete (latent) states from continuous observations using a variational Gaussian mixture likelihood. Importantly, our model constantly monitors the collected data searching for new Gaussian components, i.e., the perception model performs a form of structure learning (Smith et al., 2020; Friston et al., 2018; Neacsu et al., 2022) as it learns the number of Gaussian components in the mixture. Additionally, the transition model learns the temporal transition between consecutive time steps by taking advantage of the…
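The pipeline outlined in the abstract and findings (discrete states, learned transition probabilities, Q-values for navigation) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the perception model has already mapped each observation to a discrete state index, and it estimates transitions from Dirichlet pseudo-counts, which is an assumption made here because the abstract is truncated before it describes the mechanism. The function names, the three-state corridor, and the reward vector are all hypothetical.

```python
# Illustrative sketch only: names, environment and prior are assumptions,
# not taken from the paper.

def estimate_transitions(trajectory, n_states, n_actions, prior=0.1):
    """Posterior-mean transition probabilities P(s' | s, a) from counts.

    `trajectory` is a list of (state, action, next_state) index triples.
    `prior` is a symmetric Dirichlet pseudo-count (an assumed choice).
    """
    counts = [[[prior] * n_states for _ in range(n_actions)]
              for _ in range(n_states)]
    for s, a, s_next in trajectory:
        counts[s][a][s_next] += 1.0
    probs = []
    for s in range(n_states):
        rows = []
        for a in range(n_actions):
            total = sum(counts[s][a])
            rows.append([c / total for c in counts[s][a]])
        probs.append(rows)
    return probs


def q_value_iteration(probs, rewards, gamma=0.9, n_iters=200):
    """Compute Q-values on the learned model by repeated Bellman backups.

    `rewards[s2]` is the reward received on arriving in state s2.
    """
    n_states, n_actions = len(probs), len(probs[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(n_iters):
        v = [max(row) for row in q]  # greedy state values
        q = [[sum(p * (rewards[s2] + gamma * v[s2])
                  for s2, p in enumerate(probs[s][a]))
              for a in range(n_actions)]
             for s in range(n_states)]
    return q


# Hypothetical 3-state corridor: action 0 moves left, action 1 moves right.
step = {(0, 0): 0, (0, 1): 1, (1, 0): 0, (1, 1): 2, (2, 0): 1, (2, 1): 2}
trajectory = [(s, a, s_next) for (s, a), s_next in step.items()] * 20

probs = estimate_transitions(trajectory, n_states=3, n_actions=2)
q = q_value_iteration(probs, rewards=[0.0, 0.0, 1.0])
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(3)]
```

In this toy corridor the greedy policy moves right toward the rewarded state. In the paper's setting the state indices would come from the Gaussian mixture components rather than being given.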
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications
Methods: Q-Learning · Sparse Evolutionary Training
