Algorithms for learning value-aligned policies considering admissibility relaxation
Andrés Holgado-Sánchez, Joaquín Arias, Holger Billhardt, and Sascha Ossowski

TL;DR
This paper introduces two constrained reinforcement learning algorithms, ε-ADQL and ε-CADQL, designed to efficiently learn value-aligned policies with relaxed admissibility constraints, validated on a water distribution scenario.
Contribution
The paper presents algorithms for learning value-aligned policies under relaxed admissibility constraints, using reinforcement learning to cope with the added computational complexity that relaxation introduces.
Findings
Algorithms are effective in a water distribution drought scenario.
Proposed methods handle relaxed admissibility constraints efficiently.
Validated algorithms outperform baseline approaches.
Abstract
The emerging field of value awareness engineering claims that software agents and systems should be value-aware, i.e. they must make decisions in accordance with human values. In this context, such agents must be capable of explicitly reasoning about how far different courses of action are aligned with these values. For this purpose, values are often modelled as preferences over states or actions, which are then aggregated to determine the sequences of actions that are maximally aligned with a certain value. Recently, additional value admissibility constraints at this level have been considered as well. However, relaxed versions of these constraints are often needed, which considerably increases the complexity of computing value-aligned policies. To obtain efficient algorithms that make value-aligned decisions considering admissibility relaxation, we propose the use of…
