Formal Ethical Obligations in Reinforcement Learning Agents: Verification and Policy Updates

Colin Shea-Blymyer; Houssam Abbas

arXiv:2408.00147·cs.AI·August 2, 2024

TL;DR

This paper introduces a new logical framework for specifying and verifying ethical obligations in reinforcement learning agents, enabling transparent policy modifications to ensure compliance with social and ethical standards.

Contribution

It proposes a novel deontic logic for reasoning about agent obligations and two algorithms for model-checking and policy updating based on this logic.

Findings

1. Algorithms successfully verify agent obligations in DAC-MDPs and gridworlds.

2. The logical approach enhances transparency compared to reward-based methods.

3. Policy modifications effectively align agent behavior with specified obligations.

Abstract

When designing agents for operation in uncertain environments, designers need tools to automatically reason about what agents ought to do, how that conflicts with what is actually happening, and how a policy might be modified to remove the conflict. These obligations include ethical and social obligations, permissions and prohibitions, which constrain how the agent achieves its mission and executes its policy. We propose a new deontic logic, Expected Act Utilitarian deontic logic, for enabling this reasoning at design time: for specifying and verifying the agent's strategic obligations, then modifying its policy from a reference policy to meet those obligations. Unlike approaches that work at the reward level, working at the logical level increases the transparency of the trade-offs. We introduce two algorithms: one for model-checking whether an RL agent has the right strategic…
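As a rough illustration of the kind of reasoning the abstract describes, the sketch below works through a three-state toy MDP in Python. It is not the paper's logic or either of its algorithms. The obligation test used here (an agent is obligated to bring about a condition when every expected-utility-optimal action guarantees it at the next step) is a simplified, one-step assumption chosen for this example, and all names (`TOY_MDP`, `NO_HARM`, `is_obligated`, `updated_action`) are hypothetical.

```python
# Toy illustration only: the MDP, the state labels, and the one-step obligation
# test are assumptions made for this sketch, not the paper's definitions.
from typing import Dict, List, Optional, Set, Tuple

# transitions[state][action] -> list of (probability, next_state, reward)
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

TOY_MDP: Transitions = {
    "start": {
        "shortcut": [(0.9, "goal", 10.0), (0.1, "harm", 0.0)],
        "detour": [(1.0, "goal", 8.0)],
    },
    "goal": {"stay": [(1.0, "goal", 0.0)]},  # absorbing
    "harm": {"stay": [(1.0, "harm", 0.0)]},  # absorbing
}

# States in which the (hypothetical) condition "no harm has occurred" holds.
NO_HARM: Set[str] = {"start", "goal"}


def q_values(transitions: Transitions, gamma: float = 0.9,
             sweeps: int = 200) -> Dict[str, Dict[str, float]]:
    """Expected discounted utility of each action, via value iteration."""
    values = {s: 0.0 for s in transitions}
    q: Dict[str, Dict[str, float]] = {}
    for _ in range(sweeps):
        q = {
            s: {a: sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for a, outcomes in actions.items()}
            for s, actions in transitions.items()
        }
        values = {s: max(qs.values()) for s, qs in q.items()}
    return q


def optimal_actions(q: Dict[str, Dict[str, float]], state: str,
                    tol: float = 1e-9) -> Set[str]:
    """Actions whose expected utility is maximal at `state`."""
    best = max(q[state].values())
    return {a for a, v in q[state].items() if v >= best - tol}


def guarantees(transitions: Transitions, state: str, action: str,
               good: Set[str]) -> bool:
    """True if every possible successor of `action` satisfies the condition."""
    return all(s2 in good for p, s2, _ in transitions[state][action] if p > 0)


def is_obligated(transitions: Transitions, q: Dict[str, Dict[str, float]],
                 state: str, good: Set[str]) -> bool:
    """Assumed test: every optimal action guarantees the condition."""
    return all(guarantees(transitions, state, a, good)
               for a in optimal_actions(q, state))


def updated_action(transitions: Transitions, q: Dict[str, Dict[str, float]],
                   state: str, good: Set[str]) -> Optional[str]:
    """Best action among those that guarantee the condition, if any exists."""
    compliant = [a for a in q[state] if guarantees(transitions, state, a, good)]
    return max(compliant, key=lambda a: q[state][a]) if compliant else None


Q = q_values(TOY_MDP)
print(optimal_actions(Q, "start"))                 # reference choice
print(is_obligated(TOY_MDP, Q, "start", NO_HARM))  # does it meet the condition?
print(updated_action(TOY_MDP, Q, "start", NO_HARM))
```

In this toy setting the utility-maximizing action risks the harmful state, so the assumed obligation does not hold under the reference behavior, and the update picks the best action that does guarantee the condition. The trade-off is visible directly as the gap between the two actions' expected utilities, which is the kind of transparency the abstract attributes to working at the logical level rather than the reward level.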

Code & Models

Repositories

sabotagelab/formal-ethical-obligations
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Blockchain Technology Applications and Security · Ethics and Social Impacts of AI