Reinforcement Learning for AMR Charging Decisions: The Impact of Reward and Action Space Design
Janik Bischoff, Alexandru Rinciog, Anne Meyer

TL;DR
This paper explores how different reward and action space designs in reinforcement learning affect charging strategies for autonomous mobile robots (AMRs) in warehouses, highlighting trade-offs between flexibility, stability, and generalization.
Contribution
It introduces a new RL design for charging decisions, extends the SLAPStack simulation framework, and evaluates several reward and action space configurations, trained with Proximal Policy Optimization (PPO), against adaptive heuristic baselines.
Findings
Flexible RL approaches outperform heuristic baselines in service times.
Open-ended designs discover better strategies but need longer to converge.
Guided configurations offer stability but limited generalization.
Abstract
We propose a novel reinforcement learning (RL) design to optimize the charging strategy for autonomous mobile robots in large-scale block stacking warehouses. RL design involves a wide array of choices that can mostly only be evaluated through lengthy experimentation. Our study focuses on how different reward and action space configurations, ranging from flexible setups to more guided, domain-informed design configurations, affect the agent performance. Using heuristic charging strategies as a baseline, we demonstrate the superiority of flexible, RL-based approaches in terms of service times. Furthermore, our findings highlight a trade-off: While more open-ended designs are able to discover well-performing strategies on their own, they may require longer convergence times and are less stable, whereas guided configurations lead to a more stable learning process but display a more limited…
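The contrast between flexible and guided designs described in the abstract can be illustrated with a toy sketch. The summary does not give the paper's actual state, action, or reward definitions, so everything below is an assumption made for illustration: the names `RobotState`, `FLEXIBLE_ACTIONS`, `GUIDED_THRESHOLDS`, the threshold values, and the penalty term are hypothetical and are not taken from the paper or from SLAPStack.

```python
from dataclasses import dataclass


@dataclass
class RobotState:
    """Hypothetical, minimal view of one robot at a decision point."""
    battery: float      # state of charge in [0, 1]
    queue_length: int   # number of pending transport orders


# Flexible (open-ended) action space: the agent chooses the decision directly.
FLEXIBLE_ACTIONS = ("work", "charge")

# Guided action space: the agent only picks a battery threshold, and a fixed
# rule turns that threshold into the actual decision. Values are illustrative.
GUIDED_THRESHOLDS = (0.2, 0.4, 0.6)


def guided_decision(state: RobotState, threshold_index: int) -> str:
    """Map a guided action (threshold index) to a concrete decision."""
    threshold = GUIDED_THRESHOLDS[threshold_index]
    return "charge" if state.battery < threshold else "work"


def threshold_heuristic(state: RobotState, threshold: float = 0.3) -> str:
    """Fixed-threshold baseline of the kind an RL agent could be compared to."""
    return "charge" if state.battery < threshold else "work"


def sparse_reward(service_time: float) -> float:
    """Open-ended reward: only the outcome (lower service time) matters."""
    return -service_time


def shaped_reward(service_time: float, state: RobotState, action: str,
                  penalty: float = 1.0) -> float:
    """Guided reward: adds a domain-informed penalty (assumed here) for
    charging on a nearly full battery while orders are waiting."""
    reward = -service_time
    if action == "charge" and state.battery > 0.8 and state.queue_length > 0:
        reward -= penalty
    return reward
```

In this sketch the guided variants encode domain knowledge (sensible thresholds, a penalty for wasteful charging), which narrows what the agent can learn, while the flexible variants leave the agent to discover a policy from the outcome signal alone. That mirrors the trade-off the abstract describes, but it should not be read as the authors' implementation.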
Taxonomy
Topics: Advanced Manufacturing and Logistics Optimization · Distributed Control Multi-Agent Systems · Robotic Path Planning Algorithms
