RP-DQN: An application of Q-Learning to Vehicle Routing Problems
Ahmad Bdeir, Simon Boeder, Tim Dernedde, Kirill Tkachuk, Jonas K. Falkner, Lars Schmidt-Thieme

TL;DR
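This paper introduces RP-DQN, a Q-Learning based method for vehicle routing problems. It achieves state-of-the-art results among autoregressive construction policies on the capacitated vehicle routing problem (CVRP) and is the first machine learning approach applied to the multi-depot vehicle routing problem (MDVRP).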
Contribution
It presents a Q-Learning approach with an improved state representation for vehicle routing, trained from temporal differences, and the first ML solution for MDVRP.
Findings
Achieves state-of-the-art performance on CVRP among autoregressive policies that construct solutions by sequentially inserting nodes
First ML approach applied to MDVRP
MDVRP benefits greatly from the approach compared with other ML methods
Abstract
In this paper we present a new approach to tackle complex routing problems with an improved state representation that utilizes the model complexity better than previous methods. We enable this by training from temporal differences. Specifically Q-Learning is employed. We show that our approach achieves state-of-the-art performance for autoregressive policies that sequentially insert nodes to construct solutions on the CVRP. Additionally, we are the first to tackle the MDVRP with machine learning methods and demonstrate that this problem type greatly benefits from our approach over other ML methods.
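As a rough illustration of what "training from temporal differences" means for a policy that inserts one node at a time, the sketch below computes a one-step Q-learning target restricted to feasible next nodes. It is not the authors' implementation: the function names, the capacity-based feasibility rule, the reward (negative travel distance for the step), and the discount of 1.0 are all assumptions made for the example.

```python
def feasible_nodes(demands, visited, remaining_capacity):
    """Hypothetical CVRP mask: node 0 is the depot and is always allowed;
    a customer is allowed if it is unvisited and its demand still fits."""
    mask = []
    for i, demand in enumerate(demands):
        if i == 0:
            mask.append(True)
        else:
            mask.append((not visited[i]) and demand <= remaining_capacity)
    return mask


def td_target(reward, next_q_values, next_mask, done, gamma=1.0):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a'),
    with the max taken over feasible next nodes only."""
    if done:
        return reward
    feasible_q = [q for q, ok in zip(next_q_values, next_mask) if ok]
    if not feasible_q:
        return reward
    return reward + gamma * max(feasible_q)


# Toy step: depot plus three customers, customer 1 already served,
# 4 units of capacity left on the current vehicle.
demands = [0, 3, 5, 2]
visited = [False, True, False, False]
mask = feasible_nodes(demands, visited, remaining_capacity=4)

# Assumed reward: negative distance travelled in this step.
target = td_target(-1.5, [-4.0, -1.0, -2.0, -3.0], mask, done=False)
```

In this toy step only the depot and customer 3 are feasible, so the target bootstraps from the larger of their two Q-values rather than from the unconstrained maximum.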
Taxonomy
Methods: Q-Learning
