Multi-Objective Reinforcement Learning for Large-Scale Tote Allocation in Human-Robot Collaborative Fulfillment Centers
Sikata Sengupta, Guangyi Liu, Omer Gottesman, Joseph W Durham, Michael Kearns, Aaron Roth, Michael Caldara

TL;DR
This paper presents a multi-objective reinforcement learning approach for optimizing tote allocation in large-scale human-robot fulfillment centers, balancing processing speed, resource usage, and space utilization under operational constraints.
Contribution
It formulates tote consolidation as a large-scale MORL task with high-dimensional state spaces and introduces a method built on best-response and no-regret dynamics in zero-sum games.
Findings
Simulations on realistic warehouse scenarios show effective trade-offs among the objectives.
A single policy that satisfies multiple constraints is learned empirically.
The paper provides a theoretical framework for error cancellation in oscillatory solutions.
Abstract
Optimizing the consolidation process in container-based fulfillment centers requires trading off competing objectives such as processing speed, resource usage, and space utilization while adhering to a range of real-world operational constraints. This process involves moving items between containers via a combination of human and robotic workstations to free up space for inbound inventory and increase container utilization. We formulate this problem as a large-scale Multi-Objective Reinforcement Learning (MORL) task with high-dimensional state spaces and dynamic system behavior. Our method builds on recent theoretical advances in solving constrained RL problems via best-response and no-regret dynamics in zero-sum games, enabling principled minimax policy learning. Policy evaluation on realistic warehouse simulations shows that our approach effectively trades off objectives, and we…
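The best-response versus no-regret dynamic named in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the paper's method: it assumes a hypothetical finite set of candidate policies with known scalar reward and cost, a single budget constraint, a policy player that best-responds to the current Lagrange multiplier, and a multiplier player that runs projected gradient ascent (a standard no-regret update). The function name, step size, and toy numbers are illustrative assumptions.

```python
import numpy as np

def solve_lagrangian_game(rewards, costs, budget, steps=2000, eta=0.1, lam_max=10.0):
    """Toy zero-sum game for: maximize reward subject to cost <= budget.

    Hypothetical setup (not from the paper): each index is one candidate
    policy with a known expected reward and expected cost.
    """
    rewards = np.asarray(rewards, dtype=float)
    costs = np.asarray(costs, dtype=float)
    lam = 0.0                          # Lagrange multiplier (dual player)
    counts = np.zeros(len(rewards))    # how often each policy is the best response

    for _ in range(steps):
        # Policy player: best response to the current multiplier.
        lagrangian = rewards - lam * (costs - budget)
        best = int(np.argmax(lagrangian))
        counts[best] += 1
        # Multiplier player: projected gradient ascent on the constraint violation.
        lam = float(np.clip(lam + eta * (costs[best] - budget), 0.0, lam_max))

    mix = counts / steps               # uniform average over the played policies
    return mix, float(mix @ rewards), float(mix @ costs)

# Policy 0 is fast but costly, policy 1 is free but earns nothing.
mix, avg_reward, avg_cost = solve_lagrangian_game(
    rewards=[1.0, 0.0], costs=[1.0, 0.0], budget=0.5
)
print(mix, avg_reward, avg_cost)
```

In this toy, the individual best responses flip between the two policies, yet the averaged mixture has an average cost close to the budget of 0.5. This is one simple way oscillating iterates can average out; the paper's own analysis of error cancellation may differ.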
Taxonomy
Topics: Supply Chain and Inventory Management · Reinforcement Learning in Robotics · Scheduling and Optimization Algorithms
