Online Policy Selection for Inventory Problems
Massil Hihat, Adeline Fermanian

TL;DR
This paper introduces GAPSI, a novel online learning algorithm for complex inventory management, effectively handling non-differentiability and incorporating multiple real-world constraints to optimize replenishment decisions.
Contribution
It proposes a new feature-enhanced base-stock policy and an algorithm that addresses non-differentiability in complex inventory systems, advancing online policy selection methods.
Findings
GAPSI performs well on real-world data in complex inventory scenarios.
The algorithm effectively manages non-differentiability issues.
Numerical simulations demonstrate improved decision-making in multi-product, constrained environments.
Abstract
We tackle online inventory problems where, at each time period, the manager makes a replenishment decision based on partial historical information in order to meet demands and minimize costs. To solve such problems, we build upon recent works in online learning and control, use insights from inventory theory, and propose a new algorithm called GAPSI. This algorithm follows a new feature-enhanced base-stock policy and deals with the troublesome question of non-differentiability which occurs in inventory problems. Our method is illustrated in the context of a complex and novel inventory system involving multiple products, lost sales, perishability, warehouse-capacity constraints and lead times. Extensive numerical simulations are conducted to demonstrate the good performance of our algorithm on real-world data.
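To make the two ideas in the abstract concrete, here is a minimal sketch of a feature-enhanced (linear) base-stock level trained by projected online subgradient descent. It is not GAPSI itself. It assumes a single product that perishes after one period (so no inventory state carries over), zero lead time, and a newsvendor-style loss. All function names, the cost parameters, the step size and the synthetic demand model are hypothetical choices made for illustration.

```python
import random


def base_stock_level(theta, features):
    """Linear order-up-to level S = <theta, features> (assumed policy form)."""
    return sum(t * x for t, x in zip(theta, features))


def newsvendor_loss(level, demand, holding=1.0, penalty=4.0):
    """Holding cost on leftover units plus penalty cost on lost sales."""
    return holding * max(level - demand, 0.0) + penalty * max(demand - level, 0.0)


def loss_subgradient(level, demand, holding=1.0, penalty=4.0):
    """Return one valid subgradient of the loss with respect to the level.

    The loss has a kink at level == demand, where any value in
    [-penalty, holding] is a valid subgradient; this sketch picks 0.
    """
    if level > demand:
        return holding
    if level < demand:
        return -penalty
    return 0.0


def run_online(demands, features_seq, step=0.05, dim=2):
    """Projected online subgradient descent on the policy parameters."""
    theta = [0.0] * dim
    total = 0.0
    for demand, x in zip(demands, features_seq):
        level = base_stock_level(theta, x)
        total += newsvendor_loss(level, demand)
        g = loss_subgradient(level, demand)
        # Chain rule through the linear map, then projection onto theta >= 0
        # so that the level stays non-negative for non-negative features.
        theta = [max(0.0, t - step * g * xi) for t, xi in zip(theta, x)]
    return theta, total


def make_data(horizon=2000, seed=0):
    """Synthetic demand (an assumption): a base level plus a weekly bump."""
    rng = random.Random(seed)
    demands, features_seq = [], []
    for t in range(horizon):
        bump = 1.0 if t % 7 == 0 else 0.0
        features_seq.append([1.0, bump])
        demands.append(max(0.0, 5.0 + 3.0 * bump + rng.gauss(0.0, 1.0)))
    return demands, features_seq


if __name__ == "__main__":
    demands, features_seq = make_data()
    theta, total = run_online(demands, features_seq)
    print("learned parameters:", theta)
    print("cumulative loss:", total)
```

The explicit choice made at the kink in `loss_subgradient` is the point of the sketch: in this simplified setting any value in the interval works, whereas the systems considered in the paper also carry inventory state across periods, so the choice of derivative at non-differentiable points needs more care.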
Peer Reviews
Decision · Submitted to ICLR 2025
The approach of applying OPS on inventory problems is creative. The idea of linear base stock policies is a nice concept despite being used in many other applications. Observations about non-differentiable points are very informative and nicely show why this aspect should not be neglected.
A lot of the material in the main body (and the appendix) is standard and should be omitted. The entire Section 2.2 is well known in the operations management and operations research communities. Consideration of base-stock policies is questionable. Safety stock (s,S) policies are much more common in practice and nicely capture fixed costs. I read with enthusiasm the arguments and illustrations of why non-differentiable points are important. However, how to cope with this issue left me empty han
The paper is very well written, and the authors' approach is easy to follow. The authors successfully apply online policy selection to inventory control, where the loss function, policy selection function and transition function are all nonsmooth. The authors explain in detail why they need to adopt customized differentiation instead of automatic differentiation to make the online algorithm work.
The methodological contribution is somewhat limited from the perspective of the ML community. For more details, see the questions below.
1. The article introduces GAPSI, an algorithm that applies the GAPS [1] method to inventory control problems, providing a sophisticated framework to tackle real-world inventory management. 2. The authors explain how common industry constraints, including perishability, lead times, and warehouse capacity limitations, can be effectively formulated as mathematical models within the GAPS [1] framework in Section 2.2. 3. The algorithm's performance is rigorously tested through extensive numeri
1. After meticulously reading this article, I feel that the biggest issue of this paper is that it merely provides a detailed implementation guide (such as policy settings, loss function selection, and modeling details) and an empirical simulation evaluation of how the GAPS in article [1] can be applied to real inventory replenishment management. 2. Due to a series of problems like 'policies, losses, and transitions are not differentiable, thus neither classical chain rules nor smoothness appl
Taxonomy
Topics: Supply Chain and Inventory Management · Auction Theory and Applications · Efficiency Analysis Using DEA
