SAT-MARL: Specification Aware Training in Multi-Agent Reinforcement Learning

Fabian Ritz; Thomy Phan; Robert Müller; Thomas Gabor; Andreas Sedlmeier; Marc Zeller; Jan Wieghardt; Reiner Schmid; Horst Sauer; Cornel Klein; Claudia Linnhoff-Popien

arXiv:2012.07949·cs.LG·April 12, 2021

TL;DR

This paper introduces SAT-MARL, a method for training multi-agent reinforcement learning systems that explicitly incorporates specifications and constraints, so that agents learn predictable and compliant behavior in industrial environments.

Contribution

It proposes a novel approach to embed functional and non-functional requirements into reward shaping for multi-agent reinforcement learning.

Findings

01 Agents can learn to comply with specifications

02 Improved predictability and safety in multi-agent systems

03 Effective across various algorithms and scenarios

Abstract

A characteristic of reinforcement learning is the ability to develop unforeseen strategies when solving problems. While such strategies sometimes yield superior performance, they may also result in undesired or even dangerous behavior. In industrial scenarios, a system's behavior also needs to be predictable and lie within defined ranges. To enable the agents to learn (how) to align with a given specification, this paper proposes to explicitly transfer functional and non-functional requirements into shaped rewards. Experiments are carried out on the smart factory, a multi-agent environment modeling an industrial lot-size-one production facility, with up to eight agents and different multi-agent reinforcement learning algorithms. Results indicate that compliance with functional and non-functional constraints can be achieved by the proposed approach.
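The idea of transferring requirements into shaped rewards can be illustrated with a minimal sketch. It is not the paper's reward scheme: the `Requirement` class, the `shaped_reward` function, the two example requirements, and the penalty values are all hypothetical, chosen only to show how a specification violation might be subtracted from an environment's base reward.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical state representation: a flat dict of named measurements.
State = Dict[str, float]


@dataclass(frozen=True)
class Requirement:
    """One specification item (assumed form, not taken from the paper)."""
    name: str
    is_violated: Callable[[State], bool]  # True if the state breaks the requirement
    penalty: float                        # non-negative cost applied on violation


def shaped_reward(base_reward: float, state: State,
                  requirements: List[Requirement]) -> float:
    """Subtract the penalty of every violated requirement from the base reward."""
    total_penalty = sum(r.penalty for r in requirements if r.is_violated(state))
    return base_reward - total_penalty


# Illustrative requirements with made-up penalty magnitudes.
REQUIREMENTS = [
    # Functional-style example: agents should not collide.
    Requirement("no_collision", lambda s: s["collisions"] > 0, 1.0),
    # Non-functional-style example: stay within a step budget.
    Requirement("step_budget", lambda s: s["steps"] > s["max_steps"], 0.5),
]

compliant = shaped_reward(
    1.0, {"collisions": 0, "steps": 10, "max_steps": 50}, REQUIREMENTS)
violating = shaped_reward(
    1.0, {"collisions": 1, "steps": 60, "max_steps": 50}, REQUIREMENTS)

print(compliant)  # base reward unchanged, no requirement violated
print(violating)  # base reward reduced by both penalties
```

In this sketch each agent would receive the shaped value instead of the raw environment reward during training. How penalties are weighted against the task reward is a design choice the sketch leaves open.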
