Gradient-based Learning in State-based Potential Games for Self-Learning Production Systems
Steve Yuwono, Marlon Löppenberg, Dorothea Schwung, Andreas Schwung

TL;DR
This paper introduces gradient-based optimization methods for state-based potential games in self-learning production systems, aiming to improve convergence speed and policy optimality over traditional exploration methods.
Contribution
It presents novel gradient-based learning approaches for state-based potential games (SbPGs), including three variants for estimating the objective function, and demonstrates that they reduce training time and improve policy quality.
Findings
Reduced training times in the testbed
Achieved policies closer to optimal
Validated on a smart production system
Abstract
In this paper, we introduce novel gradient-based optimization methods for state-based potential games (SbPGs) within self-learning distributed production systems. SbPGs are recognized for their efficacy in enabling self-optimizing distributed multi-agent systems and offer a proven convergence guarantee, which facilitates collaborative player efforts towards global objectives. Our study strives to replace conventional ad-hoc random exploration-based learning in SbPGs with contemporary gradient-based approaches, which aim for faster convergence and smoother exploration dynamics, thereby shortening training duration while upholding the efficacy of SbPGs. Moreover, we propose three distinct variants for estimating the objective function of gradient-based learning, each developed to suit the unique characteristics of the systems under consideration. To validate our methodology, we apply it…
Taxonomy
Topics: Advanced Control Systems Optimization
Methods: Self-Learning
