Off-line approximate dynamic programming for the vehicle routing problem with a highly variable customer basis and stochastic demands
Mohsen Dastpak, Fausto Errico, Ola Jabali

TL;DR
This paper introduces a reinforcement learning approach for a stochastic vehicle routing problem in which customers may appear anywhere in the service area and demand volumes are observed only upon visiting each customer. The approach outperforms three benchmark policies and is competitive with specialized methods.
Contribution
It develops a partially decentralized MDP formulation and a Q-learning algorithm, DecQN, for the VRP with a highly Variable Customer basis and Stochastic Demands (VRP-VCSD).
Findings
DecQN outperforms three benchmark policies.
The approach is competitive with specialized methods developed for the special case in which customer data are known in advance.
DecQN significantly improves expected served demands.
Abstract
We study a stochastic variant of the vehicle routing problem arising in the context of domestic donor collection services. The problem we consider combines the following attributes. Customers requesting services are variable, in the sense that the customers are stochastic but are not restricted to a predefined set, as they may appear anywhere in a given service area. Furthermore, demand volumes are stochastic and observed upon visiting the customer. The objective is to maximize the expected served demands while meeting vehicle capacity and time restrictions. We call this problem the VRP with a highly Variable Customer basis and Stochastic Demands (VRP-VCSD). For this problem, we first propose a Markov Decision Process (MDP) formulation representing the classical centralized decision-making perspective where one decision-maker establishes the routes of all vehicles. While the resulting…
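The problem setting described in the abstract can be made concrete with a small simulation. The sketch below is illustrative only and is not the paper's DecQN algorithm or its MDP formulation: it assumes a single vehicle, a rectangular service area with uniformly sampled customer locations, Euclidean travel at unit speed, and a simple nearest-feasible-customer rule. The names `sample_customers`, `simulate_route`, and `demand_sampler` are hypothetical and do not come from the paper.

```python
import math
import random


def sample_customers(n, rng, width=10.0, height=10.0):
    """Draw n customer locations anywhere in a width x height service area.

    Uniform sampling is an assumption made for illustration only.
    """
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height)) for _ in range(n)]


def simulate_route(customers, demand_sampler, capacity, time_limit,
                   speed=1.0, depot=(0.0, 0.0)):
    """Return the total demand served by one vehicle in one episode.

    The vehicle repeatedly visits the nearest customer it can reach while
    still returning to the depot within time_limit. A customer's demand is
    drawn (observed) only on arrival, and collection is capped by the
    remaining vehicle capacity.
    """
    pos, load, elapsed, served = depot, 0.0, 0.0, 0.0
    remaining = list(customers)
    while remaining and load < capacity:
        # Keep only customers that leave enough time to return to the depot.
        feasible = [
            c for c in remaining
            if elapsed + (math.dist(pos, c) + math.dist(c, depot)) / speed <= time_limit
        ]
        if not feasible:
            break
        nxt = min(feasible, key=lambda c: math.dist(pos, c))
        elapsed += math.dist(pos, nxt) / speed
        pos = nxt
        remaining.remove(nxt)
        demand = demand_sampler()  # demand volume revealed upon visiting
        collected = min(demand, capacity - load)
        load += collected
        served += collected
    return served


if __name__ == "__main__":
    rng = random.Random(0)
    customers = sample_customers(8, rng)
    total = simulate_route(customers, lambda: rng.uniform(1.0, 5.0),
                           capacity=15.0, time_limit=40.0)
    print(f"served demand in this episode: {total:.2f}")
```

Averaging the returned value over many sampled episodes gives a Monte Carlo estimate of the expected served demand under this simple rule, which is the quantity the abstract names as the objective. A learned policy would replace the nearest-feasible-customer choice.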
Taxonomy
Topics: Transportation and Mobility Innovations · Transportation Planning and Optimization · Vehicle Routing Optimization Methods
Methods: Q-Learning
