Adaptive reinforcement learning of multi-agent ethically-aligned behaviours: the QSOM and QDSOM algorithms
Rémy Chaput, Olivier Boissier, Mathieu Guillermin

TL;DR
This paper introduces QSOM and QDSOM algorithms that enable reinforcement learning agents to adapt to evolving ethical considerations in dynamic environments, demonstrated through a multi-agent energy distribution case.
Contribution
The paper presents two novel algorithms, QSOM and QDSOM, which combine a Q-Table with (Dynamic) Self-Organizing Maps to handle changing ethical reward functions in multi-agent reinforcement learning.
Findings
QSOM and QDSOM adapt effectively to changing reward functions.
They outperform baseline RL algorithms in a smart grid energy distribution task.
Both algorithms remain robust in dynamic, multi-dimensional environments.
Abstract
The numerous deployed Artificial Intelligence systems need to be aligned with our ethical considerations. However, such ethical considerations might change as time passes: our society is not fixed, and our social mores evolve. This makes it difficult for these AI systems; in the Machine Ethics field especially, it has remained an under-studied challenge. In this paper, we present two algorithms, named QSOM and QDSOM, which are able to adapt to changes in the environment, and especially in the reward function, which represents the ethical considerations that we want these systems to be aligned with. They associate the well-known Q-Table to (Dynamic) Self-Organizing Maps to handle the continuous and multi-dimensional state and action spaces. We evaluate them on a use-case of multi-agent energy repartition within a small Smart Grid neighborhood, and prove their ability to adapt, and their…
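The idea of pairing a Q-Table with Self-Organizing Maps can be illustrated with a minimal sketch. It is not the authors' implementation: the class names `SimpleSOM` and `SomQAgent`, the 1-D map topology, the epsilon-greedy policy, and all hyperparameter values are assumptions chosen for illustration, and the published algorithms may differ in their update rules. One SOM maps a continuous observation to a discrete state index, a second SOM stores prototype continuous actions, and the Q-Table is indexed by the two unit indices.

```python
import math
import random


class SimpleSOM:
    """A tiny 1-D Self-Organizing Map over real-valued vectors (illustrative only)."""

    def __init__(self, n_units, dim, rng, lr=0.5, sigma=1.0):
        # Each unit holds a prototype vector, initialised at random in [0, 1).
        self.weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
        self.lr = lr
        self.sigma = sigma

    def bmu(self, x):
        """Return the index of the best matching unit (closest prototype to x)."""
        def sq_dist(w):
            return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        return min(range(len(self.weights)), key=lambda i: sq_dist(self.weights[i]))

    def update(self, x):
        """Move the winning unit and its grid neighbours towards x."""
        winner = self.bmu(x)
        for i, w in enumerate(self.weights):
            # Gaussian neighbourhood on the 1-D grid of unit indices.
            h = math.exp(-((i - winner) ** 2) / (2 * self.sigma ** 2))
            for d in range(len(w)):
                w[d] += self.lr * h * (x[d] - w[d])
        return winner


class SomQAgent:
    """Hypothetical agent: state SOM + action SOM + Q-Table over unit indices."""

    def __init__(self, state_dim, action_dim, n_state_units=8, n_action_units=4,
                 alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.rng = random.Random(seed)
        self.state_som = SimpleSOM(n_state_units, state_dim, self.rng)
        self.action_som = SimpleSOM(n_action_units, action_dim, self.rng)
        self.q = [[0.0] * n_action_units for _ in range(n_state_units)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, observation):
        """Discretise the observation, pick an action unit, return its prototype."""
        s = self.state_som.bmu(observation)
        if self.rng.random() < self.epsilon:
            a = self.rng.randrange(len(self.q[s]))  # explore
        else:
            a = max(range(len(self.q[s])), key=lambda j: self.q[s][j])  # exploit
        return s, a, list(self.action_som.weights[a])

    def learn(self, s, a, reward, next_observation):
        """Standard Q-learning update on the discretised (state, action) pair."""
        s_next = self.state_som.bmu(next_observation)
        target = reward + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])
        # Keep adapting the state map so prototypes follow the observations.
        self.state_som.update(next_observation)


agent = SomQAgent(state_dim=2, action_dim=1)
s, a, action = agent.act([0.2, 0.8])
agent.learn(s, a, reward=1.0, next_observation=[0.25, 0.75])
```

Because the Q-values are learned online and the maps keep adapting, a reward function that changes over time simply changes the targets the table is pulled towards; this sketch does not reproduce the evaluation or the adaptation results reported in the paper.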
Taxonomy
Topics: Modular Robots and Swarm Intelligence · Reinforcement Learning in Robotics
