Multi-Agent Reinforcement Learning for Order-dispatching via Order-Vehicle Distribution Matching
Ming Zhou, Jiarui Jin, Weinan Zhang, Zhiwei Qin, Yan Jiao, Chenxi Wang, Guobin Wu, Yong Yu, Jieping Ye

TL;DR
This paper introduces a decentralized multi-agent reinforcement learning approach for large-scale order dispatching in ride-hailing systems. It improves dispatching efficiency and balances supply and demand without requiring inter-agent communication.
Contribution
It proposes a novel decentralized multi-agent RL method with KL-divergence optimization, eliminating the need for agent communication and enhancing dispatching efficiency.
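To make the distribution-matching idea concrete, here is a minimal sketch of a KL-divergence measure between where vehicles are and where orders are. It is an illustration only, not the authors' implementation: the grid-cell counts, the function names (`normalize`, `kl_divergence`, `supply_demand_gap`), the smoothing constant `eps`, and the direction of the divergence (vehicles relative to orders) are all assumptions made for this example.

```python
import math


def normalize(counts, eps=1e-8):
    """Turn raw per-cell counts into a probability distribution.

    eps is a hypothetical smoothing constant that keeps every cell
    strictly positive so the logarithm below is always defined.
    """
    smoothed = [c + eps for c in counts]
    total = sum(smoothed)
    return [x / total for x in smoothed]


def kl_divergence(p, q):
    """Discrete KL divergence D_KL(p || q) for strictly positive p and q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def supply_demand_gap(vehicle_counts, order_counts):
    """KL divergence of the vehicle distribution from the order distribution.

    Each list holds counts per grid cell (assumed layout). A value near
    zero means vehicles are spread across cells in proportion to orders.
    """
    if len(vehicle_counts) != len(order_counts):
        raise ValueError("both inputs must cover the same grid cells")
    return kl_divergence(normalize(vehicle_counts), normalize(order_counts))


# Vehicles concentrated where orders are scarce give a larger gap
# than vehicles spread in proportion to orders.
balanced = supply_demand_gap([5, 5], [10, 10])
mismatched = supply_demand_gap([9, 1], [1, 9])
print(balanced, mismatched)
```

In a learning setup, a quantity like this could act as a penalty that nudges independent agents toward cells where demand is underserved, which is one way to coordinate without exchanging messages.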
Findings
Outperforms baselines in driver income and order response rate
Effective in both simulated and real-world environments
Supports deployment on large-scale ride-hailing platforms
Abstract
Improving the efficiency of dispatching orders to vehicles is a research hotspot in online ride-hailing systems. Most existing order-dispatching solutions rely on centralized control, which requires considering all possible matches between available orders and vehicles. On large-scale ride-sharing platforms, thousands of vehicles and orders must be matched every second, which carries a very high computational cost. In this paper, we propose a decentralized-execution order-dispatching method based on multi-agent reinforcement learning to address the large-scale order-dispatching problem. Unlike previous cooperative multi-agent reinforcement learning algorithms, all agents in our method work independently, guided by an evaluation of the joint policy, with no need for communication or explicit cooperation between agents. Furthermore, we…
