Designing a double deep reinforcement learning selection tool for resilient demand prediction
Bilel Abderrahmane Benziane, Benoit Lardeux, Ayoub Mcharek, Maher Jridi

TL;DR
This paper introduces a double deep reinforcement learning architecture for automatic demand forecasting model selection, improving robustness and training efficiency in supply chain applications.
Contribution
A novel double deep reinforcement learning framework for automatic forecasting model selection and an early-stopping method based on reward convergence.
Findings
Demonstrated robustness on grocery and snack demand datasets.
Outperformed state-of-the-art forecasting model selection methods.
Abstract
Artificial intelligence in supply chain forecasting has been the subject of scientific study for several decades. However, selecting an appropriate forecasting solution remains a daunting task, because each dataset has its own distinct features. Research on this issue dates back to the 1980s, but recent developments in demand forecasting have opened new perspectives. This research aims to enhance automatic forecasting model selection by proposing a novel architecture that acts as a double deep reinforcement learning agent, automatically selecting a forecasting model from the forecasting committee at prediction time. Moreover, a novel early-stopping approach based on average reward convergence is introduced to shorten training time. To evaluate the model's performance, an empirical study was conducted…
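The abstract does not spell out the early-stopping criterion, so the following is only a minimal sketch of one way to stop training on average reward convergence. It assumes a moving average over recent episode rewards and stops once that average changes by less than a tolerance for several consecutive checks. The class name `AverageRewardEarlyStopping` and the parameters `window`, `tolerance`, and `patience` are hypothetical and are not taken from the paper.

```python
from collections import deque


class AverageRewardEarlyStopping:
    """Signal a stop once the moving-average reward stops changing.

    Illustrative assumption only: the names and default values here
    are not taken from the paper.
    """

    def __init__(self, window=50, tolerance=1e-3, patience=5):
        self.window = window        # number of episodes in the moving average
        self.tolerance = tolerance  # largest change still counted as "stable"
        self.patience = patience    # consecutive stable checks needed to stop
        self.rewards = deque(maxlen=window)
        self.previous_average = None
        self.stable_checks = 0

    def update(self, episode_reward):
        """Record one episode reward; return True when training should stop."""
        self.rewards.append(episode_reward)
        if len(self.rewards) < self.window:
            return False  # not enough history to form a full average yet

        average = sum(self.rewards) / self.window
        if (
            self.previous_average is not None
            and abs(average - self.previous_average) <= self.tolerance
        ):
            self.stable_checks += 1
        else:
            self.stable_checks = 0  # the average moved, so restart the count
        self.previous_average = average
        return self.stable_checks >= self.patience
```

In a training loop, `update` would be called once per episode with that episode's total reward, and the loop would exit on the first `True`. A larger `patience` guards against stopping on a temporary plateau, at the cost of a few extra episodes.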
