Construct, Merge, Solve & Adapt with Reinforcement Learning for the min-max Multiple Traveling Salesman Problem

Guillem Rodríguez-Corominas; Maria J. Blesa; Christian Blum

arXiv:2602.23579 · cs.AI · March 2, 2026

Open Access

TL;DR

This paper introduces RL-CMSA, a hybrid reinforcement learning approach for the min-max mTSP, combining probabilistic construction, exact optimization, and adaptive solution refinement to effectively balance workload among multiple salesmen.

Contribution

The paper presents a novel hybrid method integrating reinforcement learning with exact optimization for the min-max mTSP, improving solution quality and scalability over existing algorithms.

Findings

01 RL-CMSA consistently finds near-optimal solutions.

02 RL-CMSA outperforms state-of-the-art hybrid genetic algorithms.

03 RL-CMSA is effective on large and complex instances.

Abstract

The Multiple Traveling Salesman Problem (mTSP) extends the Traveling Salesman Problem to m tours that start and end at a common depot and jointly visit all customers exactly once. In the min-max variant, the objective is to minimize the longest tour, reflecting workload balance. We propose a hybrid approach, Construct, Merge, Solve & Adapt with Reinforcement Learning (RL-CMSA), for the symmetric single-depot min-max mTSP. The method iteratively constructs diverse solutions using probabilistic clustering guided by learned pairwise q-values, merges routes into a compact pool, solves a restricted set-covering MILP, and refines solutions via inter-route remove, shift, and swap moves. The q-values are updated by reinforcing city-pair co-occurrences in high-quality solutions, while the pool is adapted through ageing and pruning. This combination of exact optimization and reinforcement-guided…
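The two ideas the abstract leans on, the min-max objective and the reinforcement of city-pair co-occurrences, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the dictionary-based q-table, the learning rate `lr`, and the update rule `q += lr * (reward - q)` are all assumptions made for illustration, since the excerpt does not state the exact update used in RL-CMSA.

```python
import math
from itertools import combinations


def tour_length(route, coords, depot=0):
    """Euclidean length of the closed tour depot -> route -> depot."""
    path = [depot, *route, depot]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(path, path[1:]))


def min_max_cost(routes, coords, depot=0):
    """Min-max mTSP objective: the length of the longest of the m tours."""
    return max(tour_length(r, coords, depot) for r in routes)


def reinforce(q, routes, reward, lr=0.1):
    """Hypothetical update: nudge q[(i, j)] toward `reward` for every
    pair of cities that share a route in the given solution."""
    for route in routes:
        for i, j in combinations(sorted(route), 2):
            old = q.get((i, j), 0.0)
            q[(i, j)] = old + lr * (reward - old)
    return q


# Toy instance: index 0 is the depot, indices 1..4 are customers.
coords = [(0, 0), (0, 3), (4, 3), (0, -3), (4, -3)]

balanced = [[1, 2], [3, 4]]      # both tours have length 12
unbalanced = [[1, 2, 4], [3]]    # tours of length 18 and 6

balanced_cost = min_max_cost(balanced, coords)
unbalanced_cost = min_max_cost(unbalanced, coords)

# Reinforce only the pairs that co-occur in the better (balanced) solution.
q = reinforce({}, balanced, reward=1.0)
```

On this toy instance the balanced split has the lower min-max cost, so only the pairs (1, 2) and (3, 4) gain q-value. A construction step could then bias its probabilistic clustering toward grouping those cities together again.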

Taxonomy

Topics: Vehicle Routing Optimization Methods · Constraint Satisfaction and Optimization · Metaheuristic Optimization Algorithms Research