Algorithms for learning value-aligned policies considering admissibility relaxation

Andrés Holgado-Sánchez, Joaquín Arias, Holger Billhardt, and Sascha Ossowski

arXiv:2406.04838·cs.AI·June 10, 2024

TL;DR

This paper introduces two constrained reinforcement learning algorithms, ε-ADQL and ε-CADQL, designed to efficiently learn value-aligned policies with relaxed admissibility constraints, validated on a water distribution scenario.

Contribution

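The paper presents two algorithms for learning value-aligned policies under relaxed admissibility constraints. Relaxing these constraints considerably increases the complexity of computing such policies, and the authors address this with constrained reinforcement learning.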

Findings

01

Algorithms are effective in a water distribution drought scenario.

02

Proposed methods handle relaxed admissibility constraints efficiently.

03

Validated algorithms outperform baseline approaches.

Abstract

The emerging field of value awareness engineering claims that software agents and systems should be value-aware, i.e., they must make decisions in accordance with human values. In this context, such agents must be capable of explicitly reasoning about how far different courses of action are aligned with these values. For this purpose, values are often modelled as preferences over states or actions, which are then aggregated to determine the sequences of actions that are maximally aligned with a certain value. Recently, additional value admissibility constraints at this level have been considered as well. However, relaxed versions of these constraints are often needed, which considerably increases the complexity of computing value-aligned policies. To obtain efficient algorithms that make value-aligned decisions considering admissibility relaxation, we propose the use of…
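
To make the idea of learning under a relaxed admissibility constraint concrete, the following is a minimal illustrative sketch, not the paper's ε-ADQL or ε-CADQL. It assumes (this page does not confirm it) that each action has a per-state alignment score and that an action counts as admissible when its score is within a tolerance `epsilon` of the best score in that state. It then runs ordinary tabular Q-learning restricted to admissible actions. All names, the toy water-use environment, and the hyperparameters are hypothetical.

```python
import random

def admissible_actions(alignment, state, epsilon):
    """Assumed relaxation rule: keep actions whose alignment score is
    within `epsilon` of the best alignment score in this state."""
    scores = alignment[state]
    best = max(scores.values())
    return [a for a, s in scores.items() if s >= best - epsilon]

def masked_q_learning(transitions, rewards, alignment, epsilon, start,
                      episodes=500, horizon=10, lr=0.5, gamma=0.9,
                      explore=0.2, seed=0):
    """Tabular Q-learning that only selects and bootstraps from
    admissible actions. Deterministic transitions for simplicity."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in acts} for s, acts in alignment.items()}
    for _ in range(episodes):
        state = start
        for _ in range(horizon):
            allowed = admissible_actions(alignment, state, epsilon)
            if rng.random() < explore:
                action = rng.choice(allowed)  # explore among admissible actions
            else:
                action = max(allowed, key=lambda a: q[state][a])
            nxt = transitions[state][action]
            nxt_allowed = admissible_actions(alignment, nxt, epsilon)
            target = rewards[state][action] + gamma * max(
                q[nxt][a] for a in nxt_allowed)
            q[state][action] += lr * (target - q[state][action])
            state = nxt
    return q

# Hypothetical two-state toy problem: "use" water yields reward but is
# less aligned with a conservation value than "save".
alignment = {"s0": {"save": 1.0, "use": 0.6},
             "s1": {"save": 0.9, "use": 0.2}}
transitions = {"s0": {"save": "s0", "use": "s1"},
               "s1": {"save": "s0", "use": "s1"}}
rewards = {"s0": {"save": 0.0, "use": 1.0},
           "s1": {"save": 0.0, "use": 2.0}}

strict = masked_q_learning(transitions, rewards, alignment, 0.0, "s0")
relaxed = masked_q_learning(transitions, rewards, alignment, 0.5, "s0")
```

With `epsilon = 0.0` only the maximally aligned action is ever tried, so the learner never updates its estimate for `use`. With `epsilon = 0.5` the action `use` becomes admissible in `s0` but stays excluded in `s1`, which illustrates why relaxation enlarges the policy space the learner has to search.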
