Reinforcement Learning with Intrinsically Motivated Feedback Graph for Lost-sales Inventory Control

Zifan Liu; Xinran Li; Shibo Chen; Gen Li; Jiashuo Jiang; Jun Zhang

arXiv:2406.18351·cs.LG·February 18, 2025

Open Access

TL;DR

This paper introduces a novel reinforcement learning framework with a feedback graph and intrinsic motivation to improve sample efficiency in lost-sales inventory control, addressing challenges of costly online experience and demand uncertainty.

Contribution

It designs a specialized feedback graph for lost-sales IC problems and develops an intrinsic reward mechanism, significantly enhancing RL sample efficiency in this domain.

Findings

01 Enhanced sample efficiency demonstrated in experiments.

02 Theoretical analysis confirms reduced sample complexity.

03 Method outperforms baseline RL approaches in inventory control tasks.

Abstract

Reinforcement learning (RL) has proven to perform well and to be general-purpose in inventory control (IC). However, further improvement of RL algorithms in the IC domain is impeded by two limitations of online experience. First, online experience is expensive to acquire in real-world applications. Because RL algorithms have low sample efficiency, training an RL policy to convergence takes extensive time. Second, online experience may not reflect the true demand because of the lost-sales phenomenon typical in IC, which makes the learning process more challenging. To address these challenges, we propose a decision framework that combines reinforcement learning with feedback graph (RLFG) and intrinsically motivated exploration (IME) to boost sample efficiency. In particular, we first take advantage of the inherent properties of lost-sales IC problems and design the…
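To make the lost-sales limitation concrete, here is a minimal Python sketch of how demand gets censored and how one observed step might still yield extra "side" experiences for other order quantities. This is an illustration under stated assumptions, not the paper's construction, which is cut off in the abstract above: it assumes zero lead time and integer quantities, and the names `Step`, `lost_sales_step`, and `side_experiences` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Step:
    sales: int           # units actually sold (what the retailer observes)
    next_inventory: int  # leftover stock carried to the next period
    censored: bool       # True when stock ran out, so true demand is unknown


def lost_sales_step(inventory: int, order: int, demand: int) -> Step:
    """One period with zero lead time (an assumption of this sketch)."""
    available = inventory + order
    sales = min(available, demand)  # unmet demand is lost, not backlogged
    # If demand >= available, we only learn that demand was at least `available`.
    return Step(sales, available - sales, censored=demand >= available)


def side_experiences(inventory: int, order: int, step: Step, max_order: int) -> dict:
    """Outcomes (sales, next_inventory) that can be inferred for other orders.

    Assumed logic: if demand was fully observed, every alternative order can be
    replayed; if it was censored, only orders no larger than the one taken can,
    since those would also have sold out.
    """
    inferred = {}
    for alt in range(max_order + 1):
        alt_available = inventory + alt
        if not step.censored:
            alt_sales = min(alt_available, step.sales)  # step.sales is the true demand
        elif alt <= order:
            alt_sales = alt_available  # would also have sold out
        else:
            continue  # outcome depends on the unobserved part of demand
        inferred[alt] = (alt_sales, alt_available - alt_sales)
    return inferred


uncensored = lost_sales_step(inventory=2, order=3, demand=4)
censored = lost_sales_step(inventory=2, order=3, demand=9)
print(len(side_experiences(2, 3, uncensored, max_order=5)))  # all 6 orders inferable
print(len(side_experiences(2, 3, censored, max_order=5)))    # only orders 0..3
```

In this sketch, a single real interaction produces several usable transitions, which is the intuition behind using a feedback graph to improve sample efficiency; how the paper defines the graph and the intrinsic reward is not recoverable from the truncated abstract.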

Taxonomy

Topics: Supply Chain and Inventory Management · Advanced Queuing Theory Analysis · Blockchain Technology Applications and Security