Multi-Objective Reinforcement Learning for Large-Scale Tote Allocation in Human-Robot Collaborative Fulfillment Centers

Sikata Sengupta; Guangyi Liu; Omer Gottesman; Joseph W Durham; Michael Kearns; Aaron Roth; Michael Caldara

arXiv:2602.24182 · cs.LG · March 2, 2026

TL;DR

This paper presents a multi-objective reinforcement learning approach to tote allocation in large-scale human-robot fulfillment centers, balancing processing speed, resource usage, and space utilization under real-world operational constraints.

Contribution

The paper introduces a MORL method that builds on best-response and no-regret dynamics in zero-sum games and applies it to warehouse consolidation, a problem with high-dimensional state spaces and dynamic system behavior.

Findings

01. Policy evaluation on realistic warehouse simulations shows effective trade-offs among the objectives.

02. A single policy that satisfies multiple constraints is learned empirically.

03. A theoretical framework describes error cancellation in oscillatory solutions.

Abstract

Optimizing the consolidation process in container-based fulfillment centers requires trading off competing objectives such as processing speed, resource usage, and space utilization while adhering to a range of real-world operational constraints. This process involves moving items between containers via a combination of human and robotic workstations to free up space for inbound inventory and increase container utilization. We formulate this problem as a large-scale Multi-Objective Reinforcement Learning (MORL) task with high-dimensional state spaces and dynamic system behavior. Our method builds on recent theoretical advances in solving constrained RL problems via best-response and no-regret dynamics in zero-sum games, enabling principled minimax policy learning. Policy evaluation on realistic warehouse simulations shows that our approach effectively trades off objectives, and we…
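The game-theoretic idea mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the paper's algorithm or environment: it assumes a hypothetical three-action problem with one cost constraint, where the policy player best-responds to the current Lagrange multiplier and the constraint player updates the multiplier with projected gradient ascent (a standard no-regret rule). All names (`solve_constrained_bandit`, `rewards`, `costs`, `budget`, `lam_max`, `step`) and all numbers are assumptions made for illustration.

```python
import numpy as np


def solve_constrained_bandit(rewards, costs, budget,
                             lam_max=10.0, step=0.05, iters=5000):
    """Toy Lagrangian game: maximize reward subject to expected cost <= budget.

    Hypothetical illustration only; not the method or data from the paper.
    """
    rewards = np.asarray(rewards, dtype=float)
    costs = np.asarray(costs, dtype=float)
    lam = 0.0                          # Lagrange multiplier (constraint player)
    counts = np.zeros(len(rewards))    # how often each action was a best response

    for _ in range(iters):
        # Policy player: best response to the current multiplier.
        action = int(np.argmax(rewards - lam * costs))
        counts[action] += 1

        # Constraint player: no-regret update (projected gradient ascent).
        # The multiplier grows when the chosen action exceeds the budget
        # and shrinks otherwise, clipped to [0, lam_max].
        lam = float(np.clip(lam + step * (costs[action] - budget), 0.0, lam_max))

    # The time-averaged play defines a mixed policy over actions.
    return counts / iters, lam


# Assumed toy instance: higher reward comes with higher cost.
rewards = [1.0, 0.7, 0.2]
costs = [1.0, 0.5, 0.0]
avg_policy, final_lam = solve_constrained_bandit(rewards, costs, budget=0.4)

print("average policy:", avg_policy)
print("average reward:", float(np.dot(avg_policy, rewards)))
print("average cost:  ", float(np.dot(avg_policy, costs)))
```

In this toy instance no single action is both feasible and optimal, so the individual best responses keep switching between actions as the multiplier oscillates; it is the time-averaged policy that approximately meets the budget. How the paper handles such oscillation at scale is not shown here.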

Taxonomy

Topics: Supply Chain and Inventory Management · Reinforcement Learning in Robotics · Scheduling and Optimization Algorithms