Zero-shot Generalization in Inventory Management: Train, then Estimate and Decide

Tarkan Temizöz; Christina Imdahl; Remco Dijkman; Douniel Lamghari-Idrissi; Willem van Jaarsveld

arXiv:2411.00515 · cs.LG · January 21, 2026 · 2 cites

Open Access

TL;DR

This paper introduces a unifying framework and a generalizable deep reinforcement learning policy for inventory management. The policy performs well on unseen problem instances with unknown parameters by following a three-phase approach: Train, then Estimate and Decide (TED).

Contribution

The paper proposes the Train, then Estimate and Decide (TED) framework and the GC-LSN policy for zero-shot generalization in inventory management, enabling effective decision-making under parameter uncertainty without retraining.

Findings

01. GC-LSN outperforms traditional policies when parameters are known.
02. GC-LSN combined with the Kaplan-Meier estimator shows superior empirical results under parameter uncertainty.
03. The framework effectively handles diverse inventory challenges with unknown demand and lead times.
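To make the estimation step mentioned in the findings more concrete, here is a minimal sketch of the textbook Kaplan-Meier (product-limit) estimator applied to censored demand data. It is an illustration only, not the authors' implementation: the function name `kaplan_meier`, the `(value, observed)` input format, and the sample data are all assumptions. The setting assumed is that a period's sales reveal the true demand only when no stockout occurred; otherwise the observation is right-censored at the available inventory.

```python
from collections import Counter


def kaplan_meier(observations):
    """Product-limit estimate of the survival function S(t) = P(D > t).

    observations: list of (value, observed) pairs (hypothetical format).
      observed=True  -> value is the true demand
      observed=False -> demand was at least `value` (right-censored)
    Returns a list of (t, S(t)) at each distinct uncensored value t.
    """
    # Count uncensored events at each distinct value.
    events = Counter(v for v, observed in observations if observed)
    curve = []
    survival = 1.0
    for t in sorted(events):
        # At risk: every observation (censored or not) with value >= t.
        at_risk = sum(1 for v, _ in observations if v >= t)
        survival *= 1.0 - events[t] / at_risk
        curve.append((t, survival))
    return curve


if __name__ == "__main__":
    # Made-up sales records: (units sold, True if no stockout occurred).
    sales = [(3, True), (5, False), (4, True), (5, True), (2, True), (5, False)]
    for t, s in kaplan_meier(sales):
        print(f"P(D > {t}) is estimated as {s:.3f}")
```

The resulting survival curve could then be converted into an estimated demand distribution and passed to a trained policy in the Decide phase; how the paper wires these pieces together is not shown here.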

Abstract

Deploying deep reinforcement learning (DRL) in real-world inventory management presents challenges, including dynamic environments and uncertain problem parameters, e.g., demand and lead time distributions. These challenges highlight a research gap, suggesting a need for a unifying framework to model and solve sequential decision-making under parameter uncertainty. We address this by investigating an underexplored area of DRL for inventory management: training generally capable agents (GCAs) under zero-shot generalization (ZSG). Here, GCAs are advanced DRL policies designed to handle a broad range of sampled problem instances with diverse inventory challenges. ZSG refers to the ability to successfully apply learned policies to unseen instances with unknown parameters without retraining. We propose a unifying Super-Markov Decision Process formulation and the Train, then Estimate and Decide…

Taxonomy

Topics: Forecasting Techniques and Applications