Learning to Staff: Offline Reinforcement Learning and Fine-Tuned LLMs for Warehouse Staffing Optimization

Kalle Kujanpää; Yuying Zhu; Kristina Klinkner; Shervin Malmasi

arXiv:2603.24883 · cs.LG · March 27, 2026

TL;DR

This paper explores offline reinforcement learning and fine-tuned large language models to optimize warehouse staffing, demonstrating improvements in throughput and decision-making support in simulated environments.

Contribution

The paper introduces two approaches to warehouse staffing optimization, Transformer-based offline RL policies and fine-tuned LLMs, and compares their effectiveness and practical applicability.

Findings

01

Offline RL with custom Transformer-based policies achieved a 2.4% throughput improvement over historical baselines in learned simulators.

02

Fine-tuned LLMs matched or exceeded baseline performance.

03

Prompting alone was insufficient for effective decision-making.

Abstract

We investigate machine learning approaches for optimizing real-time staffing decisions in semi-automated warehouse sortation systems. Operational decision-making can be supported at different levels of abstraction, with different trade-offs. We evaluate two approaches, each in a matching simulation environment. First, we train custom Transformer-based policies using offline reinforcement learning on detailed historical state representations, achieving a 2.4% throughput improvement over historical baselines in learned simulators. In high-volume warehouse operations, improvements of this size translate to significant savings. Second, we explore LLMs operating on abstracted, human-readable state descriptions. These are a natural fit for decisions that warehouse managers make using high-level operational summaries. We systematically compare prompting techniques, automatic prompt…
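To make the reported throughput figure concrete, the following is a minimal sketch of how a relative improvement over a historical baseline could be computed from per-episode simulator throughput. The function name `relative_improvement` and all numbers are hypothetical and invented for illustration; they are chosen only so that the result lands at the 2.4% scale mentioned in the abstract, and they are not data or an evaluation procedure from the paper.

```python
from statistics import mean


def relative_improvement(policy_throughput, baseline_throughput):
    """Percentage change in mean throughput of a policy versus a baseline.

    Both arguments are sequences of per-episode throughput values
    (hypothetical units, e.g. packages per hour).
    """
    if not policy_throughput or not baseline_throughput:
        raise ValueError("both samples must be non-empty")
    base = mean(baseline_throughput)
    if base == 0:
        raise ValueError("baseline mean throughput must be non-zero")
    return 100.0 * (mean(policy_throughput) - base) / base


# Hypothetical figures, invented for illustration only (not from the paper).
baseline = [1000.0, 980.0, 1020.0]  # historical staffing decisions
policy = [1024.0, 1003.5, 1044.5]   # learned policy in the same simulator

print(f"{relative_improvement(policy, baseline):.1f}%")
```

In this toy case the baseline averages 1000 units per hour and the policy 1024, so a 2.4% gain corresponds to 24 additional units per hour at every hour of operation.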

Taxonomy

Topics: Advanced Manufacturing and Logistics Optimization · Scheduling and Optimization Algorithms · Simulation Techniques and Applications