ORPR: An OR-Guided Pretrain-then-Reinforce Learning Model for Inventory Management

Lingjie Zhao; Xue Yu; Yongzhi Qi; Hao Hu; Jianshen Zhang; Yingzheng Ma; Shuyu Han; Wei Qi; Zuo-Jun Max Shen

arXiv:2512.19001·cs.AI·January 7, 2026

TL;DR

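This paper introduces an OR-Guided Pretrain-then-Reinforce framework for inventory management. A simulation-augmented OR model generates reference decisions, a deep learning foundation model is pretrained on them, and reinforcement learning then fine-tunes the result. Reported real-world gains are a 5.27-day reduction in inventory turnover time, a 2.29% increase in in-stock rate, and a 29.95% reduction in holding costs.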

Contribution

The paper presents a hybrid framework that integrates OR and AI through pretraining followed by reinforcement learning, emphasizing structured guidance and expert-in-the-loop adaptation for supply chain optimization.

Findings

01 Achieved a 5.27-day reduction in inventory turnover time.

02 Increased in-stock rates by 2.29%.

03 Reduced holding costs by 29.95%.

Abstract

As the pursuit of synergy between Artificial Intelligence (AI) and Operations Research (OR) gains momentum in handling complex inventory systems, a critical challenge persists: how to effectively reconcile AI's adaptive perception with OR's structural rigor. To bridge this gap, we propose a novel OR-Guided "Pretrain-then-Reinforce" framework. To provide structured guidance, we propose a simulation-augmented OR model that generates high-quality reference decisions, implicitly capturing complex business constraints and managerial preferences. Leveraging these OR-derived decisions as foundational training labels, we design a domain-informed deep learning foundation model to establish foundational decision-making capabilities, followed by a reinforcement learning (RL) fine-tuning stage. Uniquely, we position RL as a deep alignment mechanism that enables the AI agent to internalize the…
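
The staged pipeline described in the abstract (OR-derived labels, supervised pretraining, then fine-tuning against simulated outcomes) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' model: the "OR model" is a textbook newsvendor order-up-to rule, the "foundation model" is a one-feature linear policy fitted by least squares, and the fine-tuning stage uses simple hill-climbing on simulated cost as a stand-in for reinforcement learning. All function names and cost parameters are hypothetical.

```python
import random
from statistics import NormalDist, mean


def or_reference_order(mean_demand, std_demand, holding_cost, stockout_cost):
    """Stand-in OR label: newsvendor order-up-to level under normal demand."""
    critical_ratio = stockout_cost / (stockout_cost + holding_cost)
    z = NormalDist().inv_cdf(critical_ratio)
    return mean_demand + z * std_demand


def pretrain(features, labels):
    """Stage 1 (assumed form): fit order = a * feature + b by least squares."""
    fx, fy = mean(features), mean(labels)
    sxx = sum((x - fx) ** 2 for x in features)
    sxy = sum((x - fx) * (y - fy) for x, y in zip(features, labels))
    a = sxy / sxx
    return a, fy - a * fx


def simulated_cost(params, scenarios, holding_cost, stockout_cost):
    """Average holding plus stockout cost of the linear policy over scenarios."""
    a, b = params
    total = 0.0
    for feature, demand in scenarios:
        order = a * feature + b
        total += holding_cost * max(order - demand, 0.0)
        total += stockout_cost * max(demand - order, 0.0)
    return total / len(scenarios)


def fine_tune(params, scenarios, holding_cost, stockout_cost, steps=200, seed=0):
    """Stage 2 stand-in (hill-climbing, NOT RL): keep perturbations that lower cost."""
    rng = random.Random(seed)
    best = params
    best_cost = simulated_cost(best, scenarios, holding_cost, stockout_cost)
    for _ in range(steps):
        candidate = (best[0] + rng.gauss(0.0, 0.01), best[1] + rng.gauss(0.0, 0.5))
        cost = simulated_cost(candidate, scenarios, holding_cost, stockout_cost)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


# Hypothetical data: five items whose only feature is their mean demand.
H, P = 1.0, 4.0  # assumed per-unit holding and stockout costs
means = [20.0, 35.0, 50.0, 65.0, 80.0]
labels = [or_reference_order(m, 0.2 * m, H, P) for m in means]
pretrained = pretrain(means, labels)

rng = random.Random(42)
scenarios = [(m, max(rng.gauss(m, 0.2 * m), 0.0)) for m in means for _ in range(50)]
pretrained_cost = simulated_cost(pretrained, scenarios, H, P)
tuned, tuned_cost = fine_tune(pretrained, scenarios, H, P)
print(f"pretrained cost: {pretrained_cost:.3f}, fine-tuned cost: {tuned_cost:.3f}")
```

Because the fine-tuning loop only accepts candidates that strictly lower the simulated cost, the tuned policy is never worse than the pretrained one on the sampled scenarios. The sketch shows only the staging; it does not reproduce the business constraints, managerial preferences, or alignment mechanism that the abstract attributes to the actual framework.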

Taxonomy

Topics: Supply Chain and Inventory Management · Forecasting Techniques and Applications · Auction Theory and Applications