Pattern Transfer Learning for Reinforcement Learning in Order Dispatching
Runzhe Wan, Sheng Zhang, Chengchun Shi, Shikai Luo, Rui Song

TL;DR
This paper introduces a pattern transfer learning framework for reinforcement learning in order dispatching, leveraging stable value relationships across environments to improve performance amid demand-supply non-stationarity.
Contribution
It proposes a novel transfer learning approach that captures stable value patterns using a concordance penalty, enhancing RL in dynamic order dispatch scenarios.
Findings
The method outperforms existing RL approaches in experiments.
Stable value relationships can be effectively transferred across environments.
The approach improves robustness to demand-supply fluctuations.
Abstract
Order dispatch is one of the central problems for ride-sharing platforms. Recently, value-based reinforcement learning algorithms have shown promising performance on this problem. However, in real-world applications, the non-stationarity of the demand-supply system makes it difficult to reuse data generated in different time periods to learn the value function. In this work, motivated by the fact that the relative relationship between the values of some states is largely stable across various environments, we propose a pattern transfer learning framework for value-based reinforcement learning in the order dispatch problem. Our method efficiently captures the value patterns by incorporating a concordance penalty. The superior performance of the proposed method is supported by experiments.
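To make the idea of a concordance penalty concrete, here is a minimal sketch of one possible form. The abstract does not give the paper's exact formulation, so everything here is an assumption for illustration: the function name `concordance_penalty`, the hinge-style surrogate, and the `margin` parameter are all hypothetical. The sketch penalizes pairs of states whose value ordering under a newly learned value function disagrees with the ordering under a value function learned from older data.

```python
import numpy as np

def concordance_penalty(v_new, v_old, margin=0.0):
    """Hinge-style penalty on state pairs whose ordering in v_new
    disagrees with their ordering in v_old.

    Illustrative only: an assumed surrogate, not the paper's exact penalty.
    """
    v_new = np.asarray(v_new, dtype=float)
    v_old = np.asarray(v_old, dtype=float)

    # Pairwise differences: entry (i, j) holds v[i] - v[j].
    d_new = v_new[:, None] - v_new[None, :]
    d_old = v_old[:, None] - v_old[None, :]

    # The sign of the old difference is the ordering we want to preserve.
    s = np.sign(d_old)

    # Zero loss when the new difference agrees in sign by at least `margin`.
    loss = np.maximum(0.0, margin - s * d_new)

    # Only pairs with a strict ordering under v_old contribute.
    mask = s != 0
    n_pairs = mask.sum()
    return float(loss[mask].sum() / n_pairs) if n_pairs else 0.0


# Same ordering at a different scale: no penalty.
print(concordance_penalty([10, 20, 30], [1, 2, 3]))
# Reversed ordering: positive penalty.
print(concordance_penalty([30, 20, 10], [1, 2, 3]))
```

In a value-based method, a term like this would typically be added to the usual temporal-difference loss with a tuning weight, so the new value function fits current data while keeping the relative ordering of states close to the old pattern. Because only the ordering is penalized, the old and new values may differ in scale, which matches the motivation that relative relationships, rather than absolute values, are stable across environments.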
Taxonomy
Topics: Transportation and Mobility Innovations · Electric Vehicles and Infrastructure · Sharing Economy and Platforms
