Constrained Latent Action Policies for Model-Based Offline Reinforcement Learning

Marvin Alles; Philip Becker-Ehmck; Patrick van der Smagt; Maximilian Karl

arXiv:2411.04562·cs.LG·December 16, 2025

TL;DR

This paper introduces Constrained Latent Action Policies (C-LAP), a model-based offline reinforcement learning method that uses a generative model to constrain actions within the latent space, improving performance, especially on datasets with visual observations.

Contribution

C-LAP proposes a constrained policy learning approach built on a generative model of joint observations and actions, reducing reliance on uncertainty penalties and requiring fewer gradient steps for policy learning.
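
To make the idea of constraining actions in a latent space concrete, here is a minimal sketch. It is not the authors' implementation: the function name `constrain_latent_action`, the tanh squashing, the bound `eps`, and the Gaussian prior parameters are all assumptions chosen for illustration. It shows one simple way to keep a policy's latent action within a fixed number of standard deviations of a learned latent action prior.

```python
import numpy as np

def constrain_latent_action(raw, prior_mean, prior_std, eps=2.0):
    """Map an unbounded policy output into prior_mean +/- eps * prior_std.

    Hypothetical helper: tanh squashing and the eps bound are illustrative
    assumptions, not taken from the paper.
    """
    bounded = np.tanh(raw)  # squash each dimension into [-1, 1]
    return prior_mean + eps * prior_std * bounded

# Toy latent action prior (made-up numbers) and arbitrary policy outputs.
rng = np.random.default_rng(0)
prior_mean = np.array([0.5, -1.0])
prior_std = np.array([0.1, 0.3])
raw = rng.normal(scale=10.0, size=(1000, 2))

latent = constrain_latent_action(raw, prior_mean, prior_std)

# Largest deviation from the prior mean, measured in prior standard deviations.
max_dev = np.max(np.abs(latent - prior_mean) / prior_std)
```

However large the raw policy output is, `max_dev` stays at or below `eps`, so a decoder trained on dataset actions would only ever see latent actions from the region the prior covers.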

Findings

1. C-LAP performs competitively on D4RL benchmarks.
2. C-LAP outperforms existing methods on visual observation datasets.
3. C-LAP requires fewer gradient steps for policy learning.

Abstract

In offline reinforcement learning, a policy is learned using a static dataset in the absence of costly feedback from the environment. In contrast to the online setting, only using static datasets poses additional challenges, such as policies generating out-of-distribution samples. Model-based offline reinforcement learning methods try to overcome these by learning a model of the underlying dynamics of the environment and using it to guide policy search. This is beneficial, but with limited datasets, errors in the model and value overestimation on out-of-distribution states can worsen performance. Current model-based methods apply some notion of conservatism to the Bellman update, often implemented using uncertainty estimation derived from model ensembles. In this paper, we propose Constrained Latent Action Policies (C-LAP), which learns a generative model of the joint…

Code & Models

Repositories

marvinalles/c-lap (JAX, official)

Videos

Constrained Latent Action Policies for Model-Based Offline Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics