Deontically Constrained Policy Improvement in Reinforcement Learning Agents

Alena Makarova; Houssam Abbas

arXiv:2506.06959·cs.AI·June 10, 2025

Open Access

TL;DR

This paper introduces a method for reinforcement learning agents to improve policies while satisfying deontic constraints, enabling ethical or situational considerations to guide decision-making in uncertain environments.

Contribution

It develops a variation on policy improvement that incorporates constraints expressed in deontic logic into RL, so that the learned policy maximizes utility subject to ethical, social, or situational rules.

Findings

01. The method reaches a constrained local maximum of utility.

02. It effectively integrates deontic constraints into policy optimization.

03. Experimental results demonstrate the approach's viability on sample MDPs.

Abstract

Markov Decision Processes (MDPs) are the most common model for decision making under uncertainty in the Machine Learning community. An MDP captures non-determinism, probabilistic uncertainty, and an explicit model of action. A Reinforcement Learning (RL) agent learns to act in an MDP by maximizing a utility function. This paper considers the problem of learning a decision policy that maximizes utility subject to satisfying a constraint expressed in deontic logic. In this setup, the utility captures the agent's mission, such as going quickly from A to B. The deontic formula represents (ethical, social, situational) constraints on how the agent might achieve its mission by prohibiting classes of behaviors. We use the logic of Expected Act Utilitarianism, a probabilistic stit logic that can be interpreted over controlled MDPs. We develop a variation on policy improvement, and show that it…
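
To make the idea of maximizing utility while prohibiting classes of behaviors more concrete, here is a minimal Python sketch of policy iteration restricted to permitted actions on a tiny tabular MDP. It is not the paper's algorithm: the abstract is truncated on this page, and the sketch does not implement Expected Act Utilitarianism or any stit logic. As a loudly labeled stand-in, the deontic constraint is replaced by a hypothetical boolean mask `allowed[s, a]` that forbids individual state-action pairs. The three-state "A to B" MDP, the function names, the rewards, and the discount factor are all assumptions made for illustration.

```python
import numpy as np


def evaluate(P, R, policy, gamma):
    """Exact policy evaluation: solve (I - gamma * P_pi) v = r_pi."""
    n = R.shape[0]
    idx = np.arange(n)
    P_pi = P[idx, policy]  # transition matrix under the policy, shape (S, S)
    r_pi = R[idx, policy]  # reward vector under the policy, shape (S,)
    return np.linalg.solve(np.eye(n) - gamma * P_pi, r_pi)


def constrained_policy_iteration(P, R, allowed, gamma=0.9, max_iters=100):
    """Policy iteration whose greedy step only considers allowed actions.

    `allowed` is a hypothetical stand-in for a deontic constraint:
    allowed[s, a] is False when action a is prohibited in state s.
    """
    assert allowed.any(axis=1).all(), "every state needs an allowed action"
    policy = allowed.argmax(axis=1)  # first allowed action in each state
    for _ in range(max_iters):
        v = evaluate(P, R, policy, gamma)
        q = R + gamma * (P @ v)  # action values, shape (S, A)
        new_policy = np.where(allowed, q, -np.inf).argmax(axis=1)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, v


# Toy MDP (assumed for illustration): states A=0, M=1, B=2 (B is absorbing).
# Action 0 is a slow route via M, action 1 is a shortcut from A straight to B.
P = np.zeros((3, 2, 3))
P[0, 0, 1] = 1.0  # A, slow route -> M
P[0, 1, 2] = 1.0  # A, shortcut   -> B
P[1, :, 2] = 1.0  # M, any action -> B
P[2, :, 2] = 1.0  # B stays in B
R = np.array([[-1.0, -1.0], [-1.0, -1.0], [0.0, 0.0]])  # -1 per step until B

unconstrained = np.ones((3, 2), dtype=bool)
no_shortcut = unconstrained.copy()
no_shortcut[0, 1] = False  # prohibit the shortcut in state A

free_policy, free_v = constrained_policy_iteration(P, R, unconstrained)
safe_policy, safe_v = constrained_policy_iteration(P, R, no_shortcut)
```

In this toy setup the unconstrained agent takes the shortcut from A, while the constrained agent takes the slower route and accepts a lower value in A. A per-action mask is far weaker than a deontic formula, which can prohibit classes of behaviors rather than single actions, so the sketch only illustrates the general shape of improving a policy inside a restricted set.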

Taxonomy

Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference