Learning safe, constrained policies via imitation learning: Connection to Probabilistic Inference and a Naive Algorithm

George Papadopoulos; George A. Vouros

arXiv:2507.06780·cs.LG·July 10, 2025

Open Access

TL;DR

This paper presents a novel imitation learning approach that learns maximum entropy policies respecting constraints from expert demonstrations, connecting probabilistic inference with reinforcement learning to improve policy safety and compliance.

Contribution

It introduces a new constrained imitation learning method grounded in probabilistic inference, with a dual gradient descent algorithm for stable training and effective constraint adherence.
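To make the dual gradient descent idea concrete, the following is a minimal sketch on a toy problem. It is not the authors' algorithm. It assumes a single-state problem with three actions, and the names `rewards`, `costs`, `budget`, `alpha`, and `lr` are hypothetical values chosen for illustration. The sketch seeks a maximum entropy policy that maximizes expected reward plus `alpha` times the entropy, subject to the expected cost staying at or below `budget`. In this toy setting the inner maximization has a closed-form softmax solution, so only the Lagrange multiplier is updated by gradient steps.

```python
import numpy as np

def max_ent_policy(rewards, costs, lam, alpha):
    """Maximizer of E[r] - lam * E[c] + alpha * H(pi) over distributions on actions."""
    logits = (rewards - lam * costs) / alpha
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

def dual_gradient_descent(rewards, costs, budget, alpha=0.5, lr=0.5, steps=2000):
    """Alternate the closed-form policy update with a projected step on lam >= 0."""
    lam = 0.0
    for _ in range(steps):
        policy = max_ent_policy(rewards, costs, lam, alpha)
        violation = float(policy @ costs) - budget  # positive when over budget
        lam = max(0.0, lam + lr * violation)        # raise lam while violated
    return max_ent_policy(rewards, costs, lam, alpha), lam

# Hypothetical toy problem: higher-reward actions also cost more.
rewards = np.array([1.0, 2.0, 3.0])
costs = np.array([0.0, 0.5, 1.0])
budget = 0.3

unconstrained = max_ent_policy(rewards, costs, lam=0.0, alpha=0.5)
policy, lam = dual_gradient_descent(rewards, costs, budget)
expected_cost = float(policy @ costs)
```

With these toy values the unconstrained policy concentrates on the most costly action, while the constrained policy shifts probability toward cheaper actions until the expected cost meets the budget. In a sequential setting with learned policy networks, the closed-form step would typically be replaced by gradient updates on the policy parameters.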

Findings

01. Successfully learns policies with multiple constraints
02. Handles diverse demonstration modalities
03. Demonstrates good generalization capabilities

Abstract

This article introduces an imitation learning method for learning maximum entropy policies that comply with constraints demonstrated by expert trajectories executing a task. The formulation of the method takes advantage of results connecting performance to bounds for the KL-divergence between demonstrated and learned policies, and its objective is rigorously justified through a connection to a probabilistic inference framework for reinforcement learning, incorporating the reinforcement learning objective and the objective to abide by constraints in an entropy maximization setting. The proposed algorithm optimizes the learning objective with dual gradient descent, supporting effective and stable training. Experiments show that the proposed method can learn effective policy models for constraint-abiding behaviour, in settings with multiple constraints of different types, accommodating…

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Motor Control and Adaptation