Scalable Deep Reinforcement Learning for Ride-Hailing

Jiekun Feng; Mark Gluzman; J. G. Dai

arXiv:2009.14679 · math.OC · April 30, 2021

TL;DR

This paper introduces a scalable deep reinforcement learning approach for ride-hailing. It decomposes the joint action over all cars into a sequence of single-driver task assignments, which keeps the action space tractable as the fleet grows, and it demonstrates the benefit in a numerical experiment based on real Didi Chuxing data.

Contribution

An action decomposition for the ride-hailing MDP that assigns tasks to drivers sequentially, so the action space no longer grows exponentially with the number of cars and deep RL algorithms can be used for control policy optimization.

Findings

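01. Sequentially assigning tasks to drivers resolves the scalability problem caused by an action space that grows exponentially with the number of cars.

02. The benefit of the decomposition is demonstrated in a numerical experiment based on real Didi Chuxing data.

03. The new action structure enables the use of deep RL algorithms for control policy optimization.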

Abstract

Ride-hailing services, such as Didi Chuxing, Lyft, and Uber, arrange thousands of cars to meet ride requests throughout the day. We consider a Markov decision process (MDP) model of a ride-hailing service system, framing it as a reinforcement learning (RL) problem. The simultaneous control of many agents (cars) presents a challenge for the MDP optimization because the action space grows exponentially with the number of cars. We propose a special decomposition for the MDP actions by sequentially assigning tasks to the drivers. The new action structure resolves the scalability problem and enables the use of deep RL algorithms for control policy optimization. We demonstrate the benefit of our proposed decomposition with a numerical experiment based on real data from Didi Chuxing.
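
The scaling argument in the abstract can be made concrete with a toy sketch. This is not the authors' implementation: the function names, the `policy` interface, the region labels, and the uniform placeholder policy are all assumptions made for illustration. It only shows the idea that choosing tasks for all cars at once means picking from K^N joint actions (N cars, K tasks each), whereas assigning cars one at a time takes N steps with K options per step.

```python
import random


def joint_action_count(num_cars: int, num_choices: int) -> int:
    """Number of joint actions if every car is assigned a task simultaneously."""
    return num_choices ** num_cars


def uniform_policy(car, assignments, choices):
    """Hypothetical placeholder policy: equal probability for every task.

    A learned policy would use the car and the assignments made so far;
    this stand-in ignores them.
    """
    return [1.0 / len(choices)] * len(choices)


def sequential_assign(cars, choices, policy, rng):
    """Assign a task to one car at a time.

    Each step selects among len(choices) options, so a full assignment
    takes len(cars) small decisions instead of one exponentially large one.
    """
    assignments = {}
    for car in cars:
        # The per-step policy may condition on earlier assignments.
        probs = policy(car, assignments, choices)
        assignments[car] = rng.choices(choices, weights=probs, k=1)[0]
    return assignments


rng = random.Random(0)
cars = [f"car_{i}" for i in range(5)]
choices = ["region_A", "region_B", "region_C"]

joint_size = joint_action_count(len(cars), len(choices))  # 3**5 = 243
plan = sequential_assign(cars, choices, uniform_policy, rng)
decisions_made = len(plan)  # one decision per car, each over 3 options
```

With 5 cars and 3 tasks the joint action space already has 243 elements, while the sequential version makes 5 decisions over 3 options each. In a deep RL setting the placeholder policy would presumably be replaced by a trained network.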
