Multi-Agent Pointer Transformer: Seq-to-Seq Reinforcement Learning for Multi-Vehicle Dynamic Pickup-Delivery Problems
Zengyu Zou, Jingyuan Wang, Yixuan Huang, Junjie Wu

TL;DR
This paper introduces the Multi-Agent Pointer Transformer, a novel reinforcement learning framework that improves decision-making efficiency and effectiveness in complex multi-vehicle pickup and delivery problems with dynamic requests.
Contribution
It proposes a Transformer-based neural network architecture with relation-aware attention and informative priors for joint multi-vehicle decision-making in dynamic routing tasks.
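One common way to make attention relation-aware is to add a learned bias, selected by the relation type between two entities, to the attention logit before the softmax. The sketch below illustrates that general idea in plain Python. It is not taken from the paper: the function name `relation_aware_attention`, the `relation_ids` matrix, and the `relation_bias` table are all hypothetical, and MAPT's actual formulation may differ.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def relation_aware_attention(queries, keys, values, relation_ids, relation_bias):
    """Scaled dot-product attention with an additive per-relation bias.

    relation_ids[i][j] names the (hypothetical) relation type between
    entity i and entity j, e.g. "pickup paired with delivery".
    relation_bias maps each relation type to a scalar added to the logit.
    """
    d = len(queries[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        logits = []
        for j, k in enumerate(keys):
            dot = sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            logits.append(dot + relation_bias[relation_ids[i][j]])
        w = softmax(logits)
        out = [sum(w[j] * values[j][c] for j in range(len(values)))
               for c in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(w)
    return outputs, all_weights
```

With identical content scores, the entity whose relation type carries the larger bias receives more attention weight, which is how relational structure can steer the encoder independently of the raw features.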
Findings
MAPT outperforms the baseline methods on the reported performance metrics.
MAPT computes solutions faster than classical operations research methods.
The framework effectively models inter-entity relationships in multi-vehicle routing.
Abstract
This paper addresses the cooperative Multi-Vehicle Dynamic Pickup and Delivery Problem with Stochastic Requests (MVDPDPSR) and proposes an end-to-end centralized decision-making framework based on sequence-to-sequence modeling, named Multi-Agent Pointer Transformer (MAPT). MVDPDPSR is an extension of the vehicle routing problem and a spatio-temporal system optimization problem, widely applied in scenarios such as on-demand delivery. Classical operations research methods face bottlenecks in computational complexity and time efficiency when handling large-scale dynamic problems. Although existing reinforcement learning methods have made some progress, they still face several challenges: 1) independent decoding across multiple vehicles fails to model joint action distributions; 2) the feature extraction network struggles to capture inter-entity relationships; 3) the joint action space is…
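The first challenge, that independent decoding cannot represent a joint action distribution, can be seen in a toy case. The sketch below is an illustrative assumption, not the paper's setup: two vehicles choose between two hypothetical requests "A" and "B", and the desired policy always sends them to different requests. Decoding each vehicle from its own marginal puts probability on collisions, while a sequential (chain-rule) factorization p(a1) p(a2 | a1), the kind a sequence-to-sequence decoder can express, recovers the target exactly.

```python
from itertools import product

REQUESTS = ["A", "B"]

# Target joint policy: the two vehicles never pick the same request.
TARGET = {("A", "B"): 0.5, ("B", "A"): 0.5, ("A", "A"): 0.0, ("B", "B"): 0.0}

def marginal(joint, vehicle):
    # Marginal action distribution of one vehicle under the joint policy.
    m = {r: 0.0 for r in REQUESTS}
    for actions, p in joint.items():
        m[actions[vehicle]] += p
    return m

def independent_joint(joint):
    # Best a per-vehicle decoder can do: product of the two marginals.
    m0, m1 = marginal(joint, 0), marginal(joint, 1)
    return {(a, b): m0[a] * m1[b] for a, b in product(REQUESTS, repeat=2)}

def autoregressive_joint(joint):
    # Chain rule: p(a1) * p(a2 | a1), decoding vehicle 2 after vehicle 1.
    m0 = marginal(joint, 0)
    out = {}
    for a, b in product(REQUESTS, repeat=2):
        cond = joint[(a, b)] / m0[a] if m0[a] > 0 else 0.0
        out[(a, b)] = m0[a] * cond
    return out

collision_mass = sum(independent_joint(TARGET)[(r, r)] for r in REQUESTS)
```

Here `collision_mass` is 0.5: half of the probability under independent decoding goes to both vehicles chasing the same request, whereas `autoregressive_joint(TARGET)` reproduces `TARGET` with no collision mass.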
Taxonomy
Topics: Vehicle Routing Optimization Methods · Transportation and Mobility Innovations · Traffic control and management
