Construct, Merge, Solve & Adapt with Reinforcement Learning for the min-max Multiple Traveling Salesman Problem
Guillem Rodríguez-Corominas, Maria J. Blesa, Christian Blum

TL;DR
This paper introduces RL-CMSA, a hybrid reinforcement learning approach for the min-max mTSP, combining probabilistic construction, exact optimization, and adaptive solution refinement to effectively balance workload among multiple salesmen.
Contribution
The paper presents a novel hybrid method integrating reinforcement learning with exact optimization for the min-max mTSP, improving solution quality and scalability over existing algorithms.
Findings
RL-CMSA consistently finds near-optimal solutions.
Outperforms state-of-the-art hybrid genetic algorithms.
Effective on large and complex instances.
Abstract
The Multiple Traveling Salesman Problem (mTSP) extends the Traveling Salesman Problem to m tours that start and end at a common depot and jointly visit all customers exactly once. In the min-max variant, the objective is to minimize the longest tour, reflecting workload balance. We propose a hybrid approach, Construct, Merge, Solve & Adapt with Reinforcement Learning (RL-CMSA), for the symmetric single-depot min-max mTSP. The method iteratively constructs diverse solutions using probabilistic clustering guided by learned pairwise q-values, merges routes into a compact pool, solves a restricted set-covering MILP, and refines solutions via inter-route remove, shift, and swap moves. The q-values are updated by reinforcing city-pair co-occurrences in high-quality solutions, while the pool is adapted through ageing and pruning. This combination of exact optimization and reinforcement-guided…
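To make the min-max objective and the idea of reinforcing city-pair co-occurrences concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the Euclidean distance, the learning rate `rate`, the neutral starting value 0.5, and the update rule itself are all assumptions chosen for illustration, since the abstract does not give the exact formulas.

```python
import math
from itertools import combinations


def tour_length(tour, coords, depot=0):
    """Length of the closed route depot -> customers in order -> depot."""
    path = [depot, *tour, depot]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(path, path[1:]))


def min_max_objective(tours, coords, depot=0):
    """Min-max mTSP objective: the length of the longest of the m tours."""
    return max(tour_length(t, coords, depot) for t in tours)


def reinforce_q(q, tours, rate=0.1):
    """Hypothetical update: nudge q[(i, j)] toward 1 for each pair of
    cities that shares a tour in a high-quality solution.
    Unseen pairs start from an assumed neutral value of 0.5."""
    for tour in tours:
        for i, j in combinations(sorted(tour), 2):
            old = q.get((i, j), 0.5)
            q[(i, j)] = old + rate * (1.0 - old)
    return q


# Toy instance: depot 0 and three customers, split over m = 2 salesmen.
coords = {0: (0, 0), 1: (3, 4), 2: (6, 8), 3: (0, 5)}
tours = [[1, 2], [3]]

objective = min_max_objective(tours, coords)
q_values = reinforce_q({}, tours)
```

In this toy instance the first tour is the longer one, so it alone determines the objective; shortening the second tour would not improve the solution. That is what distinguishes the min-max variant from minimizing total distance. Only the pair (1, 2) is reinforced, because customer 3 travels alone.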
Taxonomy
Topics: Vehicle Routing Optimization Methods · Constraint Satisfaction and Optimization · Metaheuristic Optimization Algorithms Research
