Atomic Proximal Policy Optimization for Electric Robo-Taxi Dispatch and Charger Allocation
Jim Dai, Manxi Wu, Zhanhao Zhang

TL;DR
This paper introduces Atomic-PPO, a scalable deep reinforcement learning algorithm for optimizing electric robo-taxi dispatch and charging. It outperforms benchmark methods on real-world NYC for-hire vehicle trip records, and the paper analyzes how charger allocation affects system performance.
Contribution
The paper presents a novel atomic action decomposition method within PPO to handle the fleet dispatching action space in robo-taxi dispatch and charging optimization, which grows exponentially with the number of vehicles.
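To make the scaling argument concrete, here is a minimal sketch of one way an atomic decomposition can work: instead of choosing one joint action for the whole fleet, a shared policy assigns one vehicle at a time. This is an illustration under stated assumptions, not the authors' implementation. The action names, the battery-level state, and the `toy_score` function are hypothetical stand-ins for a learned policy network.

```python
import math
import random

# Hypothetical per-vehicle ("atomic") actions; the paper's actual action set may differ.
ATOMIC_ACTIONS = ["serve_trip", "reposition", "charge", "idle"]


def joint_action_count(num_vehicles, num_atomic_actions):
    """Size of the fleet-level action space when every vehicle is assigned independently."""
    return num_atomic_actions ** num_vehicles


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def toy_score(battery_level, earlier_assignments, action):
    """Hypothetical scoring rule standing in for a policy network.

    Low-battery vehicles favor charging, high-battery vehicles favor serving trips.
    earlier_assignments is unused here; a learned policy could condition on it.
    """
    if action == "charge":
        return 2.0 * (1.0 - battery_level)
    if action == "serve_trip":
        return 2.0 * battery_level
    return 0.0


def sample_fleet_action(vehicle_states, atomic_actions, score_fn, rng):
    """Build a fleet action one vehicle at a time.

    Each step picks among len(atomic_actions) options, so the policy output
    stays the same size no matter how many vehicles are in the fleet.
    Returns the list of assignments and the summed log-probability.
    """
    assignments = []
    log_prob = 0.0
    for state in vehicle_states:
        probs = softmax([score_fn(state, assignments, a) for a in atomic_actions])
        idx = rng.choices(range(len(atomic_actions)), weights=probs, k=1)[0]
        assignments.append(atomic_actions[idx])
        log_prob += math.log(probs[idx])
    return assignments, log_prob


rng = random.Random(0)
fleet_battery_levels = [0.9, 0.2, 0.5]  # made-up states for three vehicles
fleet_action, fleet_log_prob = sample_fleet_action(
    fleet_battery_levels, ATOMIC_ACTIONS, toy_score, rng
)
```

With 4 atomic actions, a 10-vehicle fleet has 4^10 = 1,048,576 joint actions, while the sequential sampler only ever chooses among 4 options per step. The summed log-probability is the quantity a PPO-style probability ratio would need if the sequence of atomic choices is treated as a single fleet action, which is an assumption of this sketch.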
Findings
Atomic-PPO outperforms benchmark methods in long-run average reward.
Efficient charger allocation significantly impacts system performance.
Vehicle range and charger speed influence dispatch efficiency and system costs.
Abstract
Pioneering companies such as Waymo have deployed robo-taxi services in several U.S. cities. These robo-taxis are electric vehicles, and their operations require the joint optimization of ride matching, vehicle repositioning, and charging scheduling in a stochastic environment. We model the operations of the ride-hailing system with robo-taxis as a discrete-time, average-reward Markov Decision Process with an infinite horizon. As the fleet size grows, dispatching becomes challenging, as both the system state space and the fleet dispatching action space grow exponentially with the number of vehicles. To address this, we introduce a scalable deep reinforcement learning algorithm, called Atomic Proximal Policy Optimization (Atomic-PPO), that reduces the action space using atomic action decomposition. We evaluate our algorithm using real-world NYC for-hire vehicle trip records and measure…
Taxonomy
Topics: Electric Vehicles and Infrastructure · Transportation and Mobility Innovations · Advanced Battery Technologies Research
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Sparse Evolutionary Training
