Reinforcement Learning for AMR Charging Decisions: The Impact of Reward and Action Space Design

Janik Bischoff; Alexandru Rinciog; Anne Meyer

arXiv:2505.11136·cs.AI·May 19, 2025

Open Access

TL;DR

This paper explores how different reward and action space designs in reinforcement learning affect autonomous robot charging strategies in warehouses, highlighting trade-offs between flexibility, stability, and generalization.

Contribution

It introduces a new RL design for charging decisions, extends the SLAPStack simulation framework, and evaluates various configurations with adaptive heuristics and PPO.

Findings

01. Flexible RL approaches outperform heuristic baselines in service times.

02. Open-ended designs can discover well-performing strategies on their own, but they may take longer to converge and are less stable.

03. Guided configurations learn more stably but generalize less well.

Abstract

We propose a novel reinforcement learning (RL) design to optimize the charging strategy for autonomous mobile robots in large-scale block stacking warehouses. RL design involves a wide array of choices that can mostly only be evaluated through lengthy experimentation. Our study focuses on how different reward and action space configurations, ranging from flexible setups to more guided, domain-informed design configurations, affect the agent performance. Using heuristic charging strategies as a baseline, we demonstrate the superiority of flexible, RL-based approaches in terms of service times. Furthermore, our findings highlight a trade-off: While more open-ended designs are able to discover well-performing strategies on their own, they may require longer convergence times and are less stable, whereas guided configurations lead to a more stable learning process but display a more limited…

Taxonomy

Topics: Advanced Manufacturing and Logistics Optimization · Distributed Control Multi-Agent Systems · Robotic Path Planning Algorithms
