Gaussian-Mixture-Model Q-Functions for Policy Iteration in Reinforcement Learning

Minh Vu; Konstantinos Slavakis

arXiv:2512.18763 · cs.LG · December 23, 2025

TL;DR

This paper proposes Gaussian mixture models (GMMs) as direct surrogates for Q-functions in reinforcement learning. The resulting policy-evaluation step is efficient, comes with theoretical guarantees, and performs competitively without using experience data.

Contribution

The paper introduces GMM-QFs as a function-approximation model for Q-functions, estimates their parameters via Riemannian optimization, and shows that they are universal approximators that perform well in practice.

Findings

01 GMM-QFs are universal approximators for Q-functions.

02 They achieve competitive or superior performance on benchmark RL tasks.

03 They operate efficiently without requiring experience data.

Abstract

Unlike their conventional use as estimators of probability density functions in reinforcement learning (RL), this paper introduces a novel function-approximation role for Gaussian mixture models (GMMs) as direct surrogates for Q-function losses. These parametric models, termed GMM-QFs, possess substantial representational capacity, as they are shown to be universal approximators over a broad class of functions. They are further embedded within Bellman residuals, where their learnable parameters -- a fixed number of mixing weights, together with Gaussian mean vectors and covariance matrices -- are inferred from data via optimization on a Riemannian manifold. This geometric perspective on the parameter space naturally incorporates Riemannian optimization into the policy-evaluation step of standard policy-iteration frameworks. Rigorous theoretical results are established, and supporting…
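To make the idea of a GMM-shaped Q-function inside a Bellman residual concrete, here is a minimal sketch in Python. It is not the authors' implementation: the unnormalized Gaussian kernels, the stacking of state and action into one vector `z`, the one-sample residual, and the names `gmm_q` and `bellman_residual` are all assumptions made for illustration, and the Riemannian optimization step is not shown.

```python
import numpy as np

def gmm_q(z, weights, means, covs):
    """Evaluate a Gaussian-mixture-shaped surrogate at one state-action vector z.

    Assumed form (hypothetical, not taken from the paper):
        Q(z) = sum_k w_k * exp(-0.5 * (z - m_k)^T C_k^{-1} (z - m_k))
    """
    total = 0.0
    for w, m, c in zip(weights, means, covs):
        diff = z - m
        # Solve C x = diff rather than forming the inverse explicitly.
        maha = diff @ np.linalg.solve(c, diff)
        total += w * np.exp(-0.5 * maha)
    return total

def bellman_residual(z, reward, z_next, gamma, weights, means, covs):
    """One-sample residual Q(z) - (r + gamma * Q(z')) for a fixed policy."""
    target = reward + gamma * gmm_q(z_next, weights, means, covs)
    return gmm_q(z, weights, means, covs) - target

# Toy parameters: two components in a 2-D state-action space (illustrative only).
weights = np.array([1.0, 0.5])
means = np.array([[0.0, 0.0], [1.0, 1.0]])
covs = np.array([np.eye(2), 2.0 * np.eye(2)])  # positive-definite covariances

q_at_mean = gmm_q(means[0], weights, means, covs)
residual = bellman_residual(means[0], 1.0, means[1], 0.9, weights, means, covs)
print(q_at_mean, residual)
```

In a policy-evaluation step, a loss built from such residuals would be minimized over the mixing weights, means, and covariance matrices; keeping the covariances positive definite is the constraint that a manifold-based optimizer would handle.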

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Multi-Objective Optimization Algorithms · Gaussian Processes and Bayesian Inference