Balancing Specialization and Centralization: A Multi-Agent Reinforcement Learning Benchmark for Sequential Industrial Control
Tom Maus, Asma Atamna, Tobias Glasmachers

TL;DR
This paper introduces a new industrial control benchmark environment for multi-agent reinforcement learning, comparing modular and monolithic control strategies and emphasizing the importance of action masking for effective learning.
Contribution
The paper presents an industry-inspired benchmark that combines sorting and pressing tasks from SortingEnv and ContainerGym, evaluates modular and monolithic control architectures, and analyzes the impact of action masking on learning performance.
Findings
Action masking significantly improves learning performance.
Without action masking, the modular architecture outperforms the monolithic one.
Performance gap narrows with action masking, reducing the advantage of specialization.
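Action masking, as referred to in the findings above, generally means preventing the policy from selecting actions that are invalid in the current state. The following is a minimal generic sketch of one common way to do this (zeroing the probability of invalid actions before sampling). It is not the authors' implementation; the function names, the logits, and the mask values are all hypothetical and chosen only for illustration.

```python
import math
import random


def masked_softmax(logits, mask):
    """Softmax restricted to valid actions; invalid actions get probability 0."""
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    peak = max(l for l, ok in zip(logits, mask) if ok)
    weights = [math.exp(l - peak) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(logits, mask, rng=random):
    """Sample an action index from the masked distribution."""
    probs = masked_softmax(logits, mask)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical example: four actions, of which action 2 is currently infeasible
# (illustrative values only, not taken from the benchmark).
logits = [0.5, 1.0, 3.0, -0.2]
mask = [True, True, False, True]
probs = masked_softmax(logits, mask)
```

Even though action 2 has the largest logit here, it receives zero probability, so the agent never spends exploration on a choice the environment would reject.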
Abstract
Autonomous control of multi-stage industrial processes requires both local specialization and global coordination. Reinforcement learning (RL) offers a promising approach, but its industrial adoption remains limited due to challenges such as reward design, modularity, and action space management. Many academic benchmarks differ markedly from industrial control problems, limiting their transferability to real-world applications. This study introduces an enhanced industry-inspired benchmark environment that combines tasks from two existing benchmarks, SortingEnv and ContainerGym, into a sequential recycling scenario with sorting and pressing operations. We evaluate two control strategies: a modular architecture with specialized agents and a monolithic agent governing the full system, while also analyzing the impact of action masking. Our experiments show that without action masking,…
Taxonomy
Topics: Scheduling and Optimization Algorithms · Flexible and Reconfigurable Manufacturing Systems · Reinforcement Learning in Robotics
