Off-line approximate dynamic programming for the vehicle routing problem with a highly variable customer basis and stochastic demands

Mohsen Dastpak; Fausto Errico; Ola Jabali

arXiv:2109.10200·math.OC·July 15, 2022

Open Access

TL;DR

This paper introduces a novel reinforcement learning approach to solve a complex stochastic vehicle routing problem with variable customers and demands, outperforming benchmarks and competing with specialized methods.

Contribution

It develops a partially decentralized MDP formulation and a Q-learning algorithm, DecQN, to efficiently address the VRP-VCSD with high variability and stochasticity.
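To make the Q-learning ingredient concrete, here is a minimal sketch of a generic tabular Q-learning update. It is not the paper's DecQN, which this summary describes only at a high level. The state and action labels, the learning rate `alpha`, and the discount factor `gamma` are hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.99):
    """One tabular Q-learning step.

    Moves Q(state, action) toward reward + gamma * max_a' Q(next_state, a').
    If next_actions is empty (terminal state), the bootstrap term is 0.
    """
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: a vehicle at state "s0" visits customer c1 and serves 5 units.
q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
first = q_update(q, "s0", "visit_c1", reward=5.0,
                 next_state="s1", next_actions=["visit_c2", "return_depot"])
second = q_update(q, "s0", "visit_c1", reward=5.0,
                  next_state="s1", next_actions=["visit_c2", "return_depot"])
```

In a decentralized setting, one would presumably apply such an update from each vehicle's own observations, but how DecQN does this is not specified in the text above.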

Findings

01. DecQN outperforms three benchmark policies.

02. The approach is competitive with specialized methods on instances where the data are known in advance.

03. DecQN significantly improves expected served demands.

Abstract

We study a stochastic variant of the vehicle routing problem arising in the context of domestic donor collection services. The problem we consider combines the following attributes. Customers requesting services are variable, in the sense that the customers are stochastic but are not restricted to a predefined set, as they may appear anywhere in a given service area. Furthermore, demand volumes are stochastic and observed upon visiting the customer. The objective is to maximize the expected served demands while meeting vehicle capacity and time restrictions. We call this problem the VRP with a highly Variable Customer basis and Stochastic Demands (VRP-VCSD). For this problem, we first propose a Markov Decision Process (MDP) formulation representing the classical centralized decision-making perspective where one decision-maker establishes the routes of all vehicles. While the resulting…
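The problem attributes listed in the abstract can be illustrated with a small simulation. The sketch below is a toy under stated assumptions, not the paper's model or policy: a single vehicle, a unit-square service area, Euclidean travel at unit speed, uniformly distributed demands, and a greedy nearest-feasible-customer rule are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_episode(n_customers=10, capacity=20.0, time_limit=3.0, seed=0):
    """Simulate one route and return (served_demand, elapsed_time)."""
    rng = random.Random(seed)
    depot = (0.5, 0.5)
    # Customers may appear anywhere in the service area (assumed: unit square).
    customers = [(rng.random(), rng.random()) for _ in range(n_customers)]

    pos, clock, remaining, served = depot, 0.0, capacity, 0.0
    unvisited = set(range(n_customers))

    while unvisited and remaining > 0:
        # Keep only customers the vehicle can visit and still return in time.
        feasible = [
            i for i in unvisited
            if clock + math.dist(pos, customers[i])
            + math.dist(customers[i], depot) <= time_limit
        ]
        if not feasible:
            break
        # Assumed policy: go to the nearest feasible customer.
        nxt = min(feasible, key=lambda i: math.dist(pos, customers[i]))
        clock += math.dist(pos, customers[nxt])
        pos = customers[nxt]
        unvisited.remove(nxt)

        # The demand volume is observed only upon arrival (assumed uniform).
        demand = rng.uniform(1.0, 10.0)
        collected = min(demand, remaining)  # cannot exceed remaining capacity
        remaining -= collected
        served += collected

    clock += math.dist(pos, depot)  # return to the depot
    return served, clock


served, clock = simulate_episode()
# Averaging over seeds gives a Monte Carlo estimate of expected served demand.
mean_served = sum(simulate_episode(seed=s)[0] for s in range(100)) / 100
```

Averaging `served` over many sampled episodes estimates the objective named in the abstract, the expected served demand, for whatever policy replaces the greedy rule.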

Taxonomy

Topics: Transportation and Mobility Innovations · Transportation Planning and Optimization · Vehicle Routing Optimization Methods

Methods: travel james · Q-Learning