Gradient-based Learning in State-based Potential Games for Self-Learning Production Systems

Steve Yuwono; Marlon Löppenberg; Dorothea Schwung; Andreas Schwung

arXiv:2406.10015 · cs.LG · June 17, 2024 · 1 citation

Open Access

TL;DR

This paper introduces gradient-based optimization methods for state-based potential games in self-learning production systems, aiming to improve convergence speed and policy optimality over traditional exploration methods.

Contribution

The paper presents novel gradient-based learning approaches for state-based potential games (SbPGs), including three variants for estimating the objective function, and demonstrates that they reduce training time and improve policy quality.

Findings

01 Reduced training times in the testbed

02 Achieved more optimal policies

03 Validated on a smart production system

Abstract

In this paper, we introduce novel gradient-based optimization methods for state-based potential games (SbPGs) within self-learning distributed production systems. SbPGs are recognised for their efficacy in enabling self-optimizing distributed multi-agent systems and offer a proven convergence guarantee, which facilitates collaborative player efforts towards global objectives. Our study strives to replace conventional ad-hoc random exploration-based learning in SbPGs with contemporary gradient-based approaches, which aim for faster convergence and smoother exploration dynamics, thereby shortening training duration while upholding the efficacy of SbPGs. Moreover, we propose three distinct variants for estimating the objective function of gradient-based learning, each developed to suit the unique characteristics of the systems under consideration. To validate our methodology, we apply it…
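To make the idea of gradient-based learning in a potential game concrete, the following is a minimal sketch and not the authors' method. It assumes a stateless two-player exact potential game with continuous scalar actions, whereas SbPGs additionally carry states. The functions `potential`, `utility`, `own_gradient`, and `run_gradient_play`, as well as the specific potential and the finite-difference gradient estimate, are hypothetical choices made for illustration. The sketch relies on one standard property: in an exact potential game, the gradient of each player's utility with respect to its own action equals the gradient of the global potential, so players that each ascend their own utility jointly ascend the potential.

```python
import math


def potential(a1, a2):
    # Hypothetical global potential (assumption), maximised at a1 = a2 = 0.5.
    return -(a1 + a2 - 1.0) ** 2 - 0.5 * (a1 - a2) ** 2


def utility(i, a1, a2):
    # Each utility is the potential plus a term that does not depend on the
    # player's own action, which makes this an exact potential game.
    if i == 0:
        return potential(a1, a2) + a2 ** 2
    return potential(a1, a2) + math.sin(a1)


def own_gradient(i, a1, a2, eps=1e-6):
    # Central finite difference of player i's utility w.r.t. its own action.
    if i == 0:
        return (utility(0, a1 + eps, a2) - utility(0, a1 - eps, a2)) / (2 * eps)
    return (utility(1, a1, a2 + eps) - utility(1, a1, a2 - eps)) / (2 * eps)


def run_gradient_play(steps=200, lr=0.1, start=(2.0, -1.0)):
    # Both players update simultaneously, each using only its own utility.
    a1, a2 = start
    for _ in range(steps):
        g1 = own_gradient(0, a1, a2)
        g2 = own_gradient(1, a1, a2)
        a1, a2 = a1 + lr * g1, a2 + lr * g2
    return a1, a2


if __name__ == "__main__":
    print(run_gradient_play())
```

Under these assumptions the joint action approaches the maximiser of the potential at (0.5, 0.5), even though neither player evaluates the potential directly. Random exploration would instead sample actions and keep the better ones, which typically needs more evaluations to reach the same point.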

Taxonomy

Topics: Advanced Control Systems Optimization

Methods: Self-Learning