Quantum Reinforcement Learning with Transformers for the Capacitated Vehicle Routing Problem

Eva Andrés

arXiv:2602.05920·cs.AI·February 6, 2026

TL;DR

This study compares classical, fully quantum, and hybrid Advantage Actor-Critic agents with transformer architectures on the Capacitated Vehicle Routing Problem (20 clients, 4 vehicles). It reports that the quantum-enhanced models outperform the classical baseline in routing quality and produce more robust route organization.

Contribution

The paper introduces a hybrid quantum-classical RL approach with transformer architectures for CVRP and reports improved performance over the classical baseline.

Findings

01 Quantum-enhanced models outperform the classical baseline in routing quality.

02 The hybrid architecture achieves the best overall performance.

03 Quantum-based models produce more structured routes.

Abstract

This paper addresses the Capacitated Vehicle Routing Problem (CVRP) by comparing classical and quantum Reinforcement Learning (RL) approaches. An Advantage Actor-Critic (A2C) agent is implemented in classical, full quantum, and hybrid variants, integrating transformer architectures to capture the relationships between vehicles, clients, and the depot through self- and cross-attention mechanisms. The experiments focus on multi-vehicle scenarios with capacity constraints, considering 20 clients and 4 vehicles, and are conducted over ten independent runs. Performance is assessed using routing distance, route compactness, and route overlap. The results show that all three approaches are capable of learning effective routing policies. However, quantum-enhanced models outperform the classical baseline and produce more robust route organization, with the hybrid architecture achieving the best…
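
As a rough illustration of the problem setting described in the abstract, the sketch below scores a candidate CVRP solution by total routing distance and checks the capacity constraint. It is not the paper's code: the function names, the Euclidean distance choice, and the toy instance are assumptions made for this example, and the paper's compactness and overlap metrics are not reproduced because their definitions are not given here.

```python
import math

def route_length(route, coords, depot=0):
    """Length of one vehicle route: depot -> clients in order -> depot."""
    stops = [depot, *route, depot]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(stops, stops[1:]))

def total_distance(routes, coords, depot=0):
    """Sum of route lengths over all vehicles (assumed Euclidean metric)."""
    return sum(route_length(r, coords, depot) for r in routes)

def is_feasible(routes, demands, capacity, n_clients):
    """True if every client is served exactly once and no route exceeds capacity."""
    visited = sorted(c for r in routes for c in r)
    if visited != list(range(1, n_clients + 1)):
        return False
    return all(sum(demands[c] for c in r) <= capacity for r in routes)

# Hypothetical toy instance: node 0 is the depot, nodes 1..4 are clients.
coords = {0: (0.0, 0.0), 1: (3.0, 0.0), 2: (3.0, 4.0), 3: (0.0, 4.0), 4: (-3.0, 0.0)}
demands = {1: 1, 2: 1, 3: 1, 4: 2}
routes = [[1, 2, 3], [4]]  # one list of clients per vehicle

print(total_distance(routes, coords))
print(is_feasible(routes, demands, capacity=3, n_clients=4))
```

The same two functions would apply unchanged to an instance of the size studied in the paper (20 clients, 4 vehicles), with `routes` holding four client lists.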

Taxonomy

Topics: Vehicle Routing Optimization Methods · Traffic control and management · Software-Defined Networks and 5G