CityFlow: A Multi-Agent Reinforcement Learning Environment for Large Scale City Traffic Scenario

Huichu Zhang; Siyuan Feng; Chang Liu; Yaoyao Ding; Yichen Zhu; Zihan Zhou; Weinan Zhang; Yong Yu; Haiming Jin; Zhenhui Li

arXiv:1905.05217·cs.MA·May 15, 2019

TL;DR

CityFlow is a scalable, fast, and flexible traffic simulator designed for reinforcement learning research, capable of supporting large-scale city traffic scenarios with improved performance over existing tools like SUMO.

Contribution

The paper introduces CityFlow, a new traffic simulator optimized for large-scale reinforcement learning applications, with significantly enhanced speed and scalability.

Findings

01. CityFlow is over twenty times faster than SUMO.

02. Supports flexible road network and traffic flow definitions.

03. Enables city-wide traffic simulation with real-time monitoring.

Table 1. Duration of vehicles under different traffic volumes

Vehicles/Hour	100	200	300	400	500
SUMO	40.76	41.57	42.75	44.08	45.93
CityFlow	40.79	41.58	42.62	43.84	45.45
Difference	0.07%	0.04%	0.30%	0.54%	1.06%

CityFlow: A Multi-Agent Reinforcement Learning Environment for Large Scale City Traffic Scenario

Huichu Zhang, Siyuan Feng, Chang Liu

Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, China

Yaoyao Ding, Yichen Zhu, Zihan Zhou, Weinan Zhang, Yong Yu, Haiming Jin

Shanghai Jiao Tong University

Zhenhui Li

Pennsylvania State University, Old Main, State College, Pennsylvania, USA

(2019)

Abstract.

Traffic signal control is an emerging application scenario for reinforcement learning. Besides being an important problem that affects people’s daily commutes, traffic signal control poses unique challenges for reinforcement learning: adapting to a dynamic traffic environment and coordinating thousands of agents, including vehicles and pedestrians. A key factor in the success of modern reinforcement learning is a good simulator that can generate a large number of data samples for learning. The most commonly used open-source traffic simulator, SUMO, is however not scalable to large road networks and large traffic flows, which hinders the study of reinforcement learning on traffic scenarios. This motivates us to create a new traffic simulator, CityFlow, with fundamentally optimized data structures and efficient algorithms. CityFlow supports flexible definitions of road networks and traffic flows based on synthetic and real-world data. It also provides a user-friendly interface for reinforcement learning. Most importantly, CityFlow is more than twenty times faster than SUMO and is capable of supporting city-wide traffic simulation with an interactive renderer for monitoring. Besides traffic signal control, CityFlow could serve as the base for other transportation studies and can create new possibilities for testing machine learning methods in the intelligent transportation domain.

Keywords: Reinforcement Learning Platform; Microscopic Traffic Simulation; Mobility

Conference: Proceedings of the 2019 World Wide Web Conference (WWW ’19), May 13–17, 2019, San Francisco, CA, USA. DOI: 10.1145/3308558.3314139. ISBN: 978-1-4503-6674-8/19/05.

CCS concepts: Computing methodologies → Multi-agent systems; Computing methodologies → Simulation environments; Applied computing → Transportation.

1. Introduction

The traffic signal control problem, one of the biggest urban problems, has drawn increasing attention in recent years (Wei et al., 2018; Li et al., 2016; Van der Pol and Oliehoek, 2016). Recent advances are enabled by large-scale real-time traffic data collected from sources such as vehicle tracking devices, location-based mobile services, and road surveillance cameras through advanced sensing technology and web infrastructure. Traffic signal control is interesting but complex because of the dynamics of traffic flow and the difficulty of coordinating thousands of traffic signals. Reinforcement learning has become one of the promising approaches to optimizing traffic signal plans, as shown in several recent studies (Wei et al., 2018; Li et al., 2016; Van der Pol and Oliehoek, 2016). At the same time, traffic signal control is one of the major real-world application scenarios for reinforcement learning (Li, 2017).

To successfully deploy reinforcement learning for traffic signal control, the traffic simulator is the most important factor, because the learning method relies on a large set of data samples. These samples can hardly be collected from the real world directly. Aside from the consequences of bad decisions, a city simply cannot generate enough data samples for learning. If we treat each minute as a data sample, a city generates only 1,440 samples (24 hours × 60 minutes) per day. Such a small sample size is not enough to train a deep reinforcement learning model that makes good decisions. Thus, it is crucial to have a simulator that is fast enough to generate a large set of data samples.

The most popular public traffic simulator, SUMO (Simulation of Urban Mobility) (Lopez et al., 2018), has been used in many recent studies. SUMO, however, does not scale with the size of the road network and the volume of traffic flow. For example, it can perform only around three simulation steps per second on a $30\times 30$ grid with tens of thousands of vehicles. The situation is even worse if we use the Python interface to retrieve information about the system to support reinforcement learning. A city, however, often has on the order of a thousand intersections (e.g. there are $30\times 40$ intersections of major roads in Hangzhou, China) and hundreds of thousands of vehicles, which is beyond the current simulation capacity of SUMO.

To enable reinforcement learning for intelligent transportation, we create a traffic simulator, CityFlow (https://github.com/cityflow-project/CityFlow/), which scales to city-wide traffic simulation. One of the major improvements over SUMO is that CityFlow supports multithreaded computation. To the best of our knowledge, this is the first open-source simulator that can support city-wide traffic simulation. CityFlow allows flexible definitions of road networks, vehicle models, and traffic signal plans. It is more than twenty times faster than SUMO. We also provide a friendly interface for use as a reinforcement learning testbed. We plan to demonstrate these functions at the demo session.

Finally, our scalable traffic simulator CityFlow will open many new possibilities beyond the traffic signal control scenario. First, it could support various large-scale transportation research studies, such as vehicle routing through mobile apps and traffic jam prevention. Second, similar to OpenAI Gym (https://gym.openai.com/), which provides a set of benchmark environments for reinforcement learning, CityFlow could serve as a benchmark reinforcement learning environment for transportation studies. Besides traffic signal control, reinforcement learning has been used in transportation studies such as taxi dispatching (Xu et al., 2018) and mixed autonomy systems (Wu et al., 2017), but all the existing studies use either SUMO or an over-simplified traffic simulator. Third, we plan to better calibrate the simulation parameters by learning from real-world observations. This will make the simulator not only generate data samples fast but also generate “real” data samples.

2. Brief Description

2.1. System Design

CityFlow is a microscopic traffic simulator: it simulates the behavior of each vehicle at each time step, providing the highest level of detail in the evolution of traffic. However, microscopic traffic simulators are subject to slow simulation speed (Yin and Qiu, 2011). Unlike SUMO, CityFlow uses multithreading to accelerate the simulation. The data structures and simulation algorithm are also optimized to further speed up the process.

2.1.1. Road Network

The road network is the basic data structure in CityFlow. A road represents a directional road from one intersection to another, with road-specific properties. A road may contain multiple lanes. Each lane holds a linked list of vehicles, which supports fast insertion and fast lookup of leading vehicles. Segments are small fragments of a lane. We design segments in order to efficiently find all vehicles within a certain range of the lane; this structure is crucial for fast lane change operations. An intersection is where roads intersect. An intersection contains several roadlinks. Each roadlink connects two roads of the intersection and can be controlled by traffic signals. A roadlink contains several lanelinks. Each lanelink represents a specific path from one lane of the incoming road to one lane of the outgoing road. A cross represents the cross point between two lanelinks; this structure is crucial for fast intersection logic.
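
The containment hierarchy described above can be sketched with plain Python dataclasses. This is only an illustration of how the pieces relate; the class and field names, the identifiers, and the use of Python lists in place of a linked list are all assumptions made for the sketch, not the simulator's actual C++ types.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """A small fragment of a lane, covering [start, end) in meters."""
    start: float
    end: float
    vehicle_ids: List[str] = field(default_factory=list)


@dataclass
class Lane:
    lane_id: str
    length: float
    # Front-to-back order; a list stands in for the linked list here.
    vehicle_ids: List[str] = field(default_factory=list)
    segments: List[Segment] = field(default_factory=list)


@dataclass
class Road:
    """A directional road from one intersection to another."""
    road_id: str
    start_intersection: str
    end_intersection: str
    lanes: List[Lane] = field(default_factory=list)


@dataclass
class LaneLink:
    """One path from a lane of the incoming road to a lane of the outgoing road."""
    start_lane: str
    end_lane: str


@dataclass
class RoadLink:
    """Connects two roads of an intersection; can be signal controlled."""
    start_road: str
    end_road: str
    lane_links: List[LaneLink] = field(default_factory=list)


@dataclass
class Cross:
    """The cross point between two lanelinks."""
    lane_link_a: LaneLink
    lane_link_b: LaneLink


@dataclass
class Intersection:
    intersection_id: str
    road_links: List[RoadLink] = field(default_factory=list)
    crosses: List[Cross] = field(default_factory=list)


def make_lane(lane_id: str, length: float, segment_length: float) -> Lane:
    """Split a lane into consecutive segments of at most segment_length."""
    lane = Lane(lane_id, length)
    start = 0.0
    while start < length:
        end = min(start + segment_length, length)
        lane.segments.append(Segment(start, end))
        start = end
    return lane


# Hypothetical identifiers, chosen only for the example.
lane = make_lane("road_0_1_0_lane_0", length=300.0, segment_length=100.0)
road = Road("road_0_1_0", "intersection_0_1", "intersection_1_1", [lane])
```

The point of the layout is that each query in the later sections touches only a small, local piece of the structure: a lane's vehicle list for car following, a few segments for lane changes, and an intersection's crosses for yielding.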

2.1.2. Car Following Model

The car-following model is the core component of CityFlow. It computes the desired speed of each vehicle at the next step using information such as traffic signals and leading vehicles, and ensures that no collisions occur in the system. Currently, the car-following model used in CityFlow is a modification of the model proposed by Stefan Krauß (Krauß, 1998). The key idea is that the vehicle drives as fast as possible subject to a perfect safety constraint (e.g. being able to stop even if the leading vehicle stops using maximum deceleration). Unlike SUMO (Lopez et al., 2018), we use the ballistic position update rule instead of the Euler position update. The ballistic update yields more realistic dynamics for car-following models based on continuous dynamics, especially for larger time steps (e.g. 1 second) (Treiber and Kanagaraj, 2015).
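
The difference between the two position update rules can be shown in a few lines. The sketch below uses the standard textbook forms of the two rules; the function names and the example numbers are illustrative assumptions.

```python
def euler_update(position, speed, new_speed, dt):
    """Euler rule: travel at the newly computed speed for the whole step."""
    return position + new_speed * dt


def ballistic_update(position, speed, new_speed, dt):
    """Ballistic rule: assume constant acceleration within the step,
    so the distance covered uses the average of old and new speed."""
    return position + 0.5 * (speed + new_speed) * dt


# A vehicle braking from 10 m/s to 6 m/s during a 1 s step.
x_euler = euler_update(0.0, 10.0, 6.0, 1.0)
x_ballistic = ballistic_update(0.0, 10.0, 6.0, 1.0)
print(x_euler, x_ballistic)  # 6.0 8.0
```

With a 1-second step, the Euler rule places the braking vehicle 2 m behind where constant deceleration would actually carry it, which is the kind of discrepancy that grows with the step length.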

Vehicles are subject to several speed constraints, and the maximum speed that satisfies all of them is chosen. Currently, the following constraints are considered:

• the vehicle’s and driver’s maximum acceleration

• road speed limit

• collision-free following speed

• headway-time following speed

• intersection-related speed
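
Since each constraint acts as an upper bound on the next speed, choosing the maximum speed that meets all of them amounts to taking the minimum of the bounds. A minimal sketch, assuming each bound has already been computed as a plain number (the function name, parameters, and values are hypothetical):

```python
def next_speed(speed, max_accel, dt, speed_limit, other_bounds):
    """Largest speed that satisfies every upper bound, floored at zero.

    other_bounds holds the remaining constraints (collision-free speed,
    headway-time speed, intersection-related speed) as upper bounds.
    """
    bounds = [speed + max_accel * dt, speed_limit, *other_bounds]
    return max(0.0, min(bounds))


# 10 m/s now, can accelerate by 2 m/s^2, limit 16.67 m/s,
# collision-free bound 11.5 m/s, headway bound 14 m/s.
v = next_speed(10.0, 2.0, 1.0, 16.67, [11.5, 14.0])
print(v)  # 11.5
```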

Due to the page limit, we only present the details of the collision-free speed computation. It takes as parameters $v_{F}$, the current speed of the following vehicle; $v_{L}$, the current speed of the leading vehicle; $d_{F}$, the maximum deceleration of the following vehicle; $d_{L}$, the maximum deceleration of the leading vehicle; $gap$, the current gap between the two vehicles; and $interval$, the length of each time step. It computes the no-collision speed $s$ by solving a quadratic equation (Equation 1).

[Equation 1]

The intersection-related speed is handled by the intersection logic, described in the next section.

2.1.3. Intersection Logic

The behavior of vehicles in an intersection is complex, and it requires careful design to efficiently mimic real-world behavior (Krajzewicz and Erdmann, 2013; Fellendorf and Vortisch, 2010). Vehicles in an intersection should obey the following two rules:

• fully stop at a red signal, and stop if possible at a yellow signal

• yield to vehicles with higher priority (e.g. turning vehicles should yield to straight-moving vehicles)

To avoid collisions at an intersection, it is necessary, but non-trivial, to check whether there are vehicles on the opposing lanes. The simplest method is a brute-force search that finds all vehicles within a certain range and checks whether they will collide within a certain time period, but this method is very time-consuming. Instead, we precompute all the cross points between lanelinks in the intersection. When a vehicle approaches the intersection, it notifies all cross points in the intersection about its arrival. Each cross point is responsible for deciding which vehicle may pass and which vehicle should yield. The time complexity of our algorithm is $\mathcal{O}(N_{crosspoints})$. Due to the page limit, we omit the details of our algorithm.
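
Because the algorithm's details are omitted, the following is only a hypothetical illustration of the notify-then-decide pattern, not the simulator's actual logic. It assumes each cross point keeps the vehicles that have announced themselves and lets the one with the highest priority pass, breaking ties by earliest estimated arrival; the class names, the priority encoding, and the tie-breaking rule are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Arrival:
    vehicle_id: str
    lane_link: str
    eta: float      # estimated time to reach the cross point, in seconds
    priority: int   # larger means higher priority (e.g. straight over turning)


@dataclass
class CrossPoint:
    arrivals: List[Arrival] = field(default_factory=list)

    def notify(self, arrival: Arrival) -> None:
        """Called by a vehicle approaching the intersection."""
        self.arrivals.append(arrival)

    def winner(self) -> Optional[str]:
        """Vehicle allowed to pass: highest priority, then earliest arrival."""
        if not self.arrivals:
            return None
        best = max(self.arrivals, key=lambda a: (a.priority, -a.eta))
        return best.vehicle_id


def may_pass(vehicle_id: str, cross_points: List[CrossPoint]) -> bool:
    """A vehicle passes only if no cross point on its path prefers another."""
    return all(cp.winner() in (None, vehicle_id) for cp in cross_points)


cp = CrossPoint()
cp.notify(Arrival("straight_car", "lanelink_0", eta=3.0, priority=2))
cp.notify(Arrival("turning_car", "lanelink_1", eta=2.0, priority=1))
```

A vehicle only consults the cross points on its own lanelink, so the work per vehicle grows with the number of cross points rather than with the number of vehicles near the intersection.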

2.1.4. Lane Change Model

The lane change model addresses two questions for a vehicle: when and how to change lanes. A vehicle may change lanes when there is more free space on an adjacent lane or when a lane change is required to follow its route. Traversing all vehicles in adjacent lanes is slow. Instead, by maintaining vehicle information in segments, which are small fragments of each lane, we only need to search for related vehicles in adjacent segments (up to three segments for each lane), which takes constant time and largely reduces the time complexity.
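
The segment lookup can be sketched as follows, assuming equal-length segments and representing each segment as a list of (vehicle id, position) pairs. The function names and the data layout are assumptions made for the illustration.

```python
def segment_index(position, segment_length, num_segments):
    """Index of the segment containing a position on the lane."""
    return min(int(position // segment_length), num_segments - 1)


def nearby_vehicles(segments, position, segment_length):
    """Vehicles in the segment at this position and its two neighbours.

    At most three segments are inspected, regardless of lane length.
    """
    i = segment_index(position, segment_length, len(segments))
    lo, hi = max(0, i - 1), min(len(segments), i + 2)
    return [vehicle for seg in segments[lo:hi] for vehicle in seg]


# An adjacent lane of 400 m split into four 100 m segments.
adjacent_lane = [
    [("a", 10.0)],
    [("b", 120.0)],
    [("c", 250.0)],
    [("d", 390.0)],
]
found = nearby_vehicles(adjacent_lane, position=130.0, segment_length=100.0)
print([vid for vid, _ in found])  # ['a', 'b', 'c']
```

Vehicle "d" is never examined, which is the saving over traversing the whole adjacent lane.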

When a vehicle decides to change lanes, it needs a way to notify other vehicles. Here we use a mechanism similar to the one in SUMO. When a vehicle changes lanes, the simulation engine puts a copy of it, called a shadow vehicle, on its destination lane. A shadow vehicle has the same function as a normal vehicle, and it can become the leader of other vehicles in the car-following model. The vehicle and its shadow move consistently: the simulation engine guarantees this by applying the speed constraints of each to the other. After the lane change finishes, the simulation engine simply removes the original vehicle and lets its shadow vehicle replace it.
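
The life cycle of a shadow vehicle can be illustrated with a small sketch. It assumes lanes are plain lists and that "applying the speed constraints to each other" means both copies adopt the tighter of their two speed bounds; the class, function names, and values are hypothetical.

```python
class Vehicle:
    def __init__(self, vehicle_id, lane_id, speed):
        self.vehicle_id = vehicle_id
        self.lane_id = lane_id
        self.speed = speed
        self.partner = None  # shadow (or original) during a lane change


def start_lane_change(vehicle, target_lane_id, lanes):
    """Place a shadow copy of the vehicle on the destination lane."""
    shadow = Vehicle(vehicle.vehicle_id, target_lane_id, vehicle.speed)
    vehicle.partner, shadow.partner = shadow, vehicle
    lanes[target_lane_id].append(shadow)
    return shadow


def apply_speed_bounds(vehicle, own_bound, shadow_bound):
    """Both copies adopt the tighter bound, so they move consistently."""
    speed = min(own_bound, shadow_bound)
    vehicle.speed = speed
    vehicle.partner.speed = speed


def finish_lane_change(vehicle, lanes):
    """Remove the original vehicle; its shadow replaces it."""
    shadow = vehicle.partner
    lanes[vehicle.lane_id].remove(vehicle)
    shadow.partner = None
    vehicle.partner = None
    return shadow


lanes = {"lane_0": [], "lane_1": []}
car = Vehicle("veh_0", "lane_0", speed=12.0)
lanes["lane_0"].append(car)

shadow = start_lane_change(car, "lane_1", lanes)
# The leader on lane_1 is closer, so the shadow's bound is the tighter one.
apply_speed_bounds(car, own_bound=11.0, shadow_bound=9.5)
car = finish_lane_change(car, lanes)
```

While the change is in progress, followers on both lanes see a leader, which is what lets the ordinary car-following model keep them safe without any lane-change-specific logic.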

2.2. Python Interface

To support multi-agent reinforcement learning, we provide a Python interface via pybind11 (Jakob et al., 2017). Users can run the simulation step by step and retrieve various kinds of information about the current state, e.g. the number of vehicles on a lane or the speed of vehicles. In addition, we provide an interface to control elements of the simulator at each time step. Currently, users can control traffic signals and add vehicles on the fly. We plan to support more types of control functions, such as vehicle behavior control and road property control, in the future.
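
A step-observe-control loop of the kind described above might look like the sketch below. The method names `next_step`, `get_lane_vehicle_count`, and `set_tl_phase`, and the constructor shown in the trailing comment, follow the project's public documentation as far as can be told and may differ from the interface version described here; treat them as assumptions. The `StubEngine` class, the intersection identifier, and the fixed-cycle controller are hypothetical and exist only so the sketch runs without CityFlow installed.

```python
class StubEngine:
    """Stand-in with the same three methods, so the loop can run anywhere."""

    def __init__(self):
        self.time = 0
        self.phase_log = []

    def next_step(self):
        self.time += 1

    def get_lane_vehicle_count(self):
        # Maps lane id -> number of vehicles on that lane.
        return {"lane_a": self.time % 3, "lane_b": 1}

    def set_tl_phase(self, intersection_id, phase_id):
        self.phase_log.append((self.time, intersection_id, phase_id))


def run_fixed_cycle(eng, intersection_id, num_phases, steps, phase_length):
    """Step the simulation while cycling one intersection through its phases."""
    history = []
    for step in range(steps):
        if step % phase_length == 0:
            phase = (step // phase_length) % num_phases
            eng.set_tl_phase(intersection_id, phase)
        eng.next_step()
        counts = eng.get_lane_vehicle_count()
        history.append(sum(counts.values()))
    return history


eng = StubEngine()
history = run_fixed_cycle(eng, "intersection_1_1", num_phases=2, steps=6, phase_length=2)

# With CityFlow installed, the stub would be replaced along these lines
# (config path and thread count are placeholders):
#   import cityflow
#   eng = cityflow.Engine("config.json", thread_num=4)
```

In a reinforcement learning setting, the fixed cycle would be replaced by an agent that maps the observed lane counts to a phase choice, and the summed counts (or a similar quantity) would feed the reward.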

2.3. Frontend

We provide a web-based graphical user interface in which users can view the replay output by the simulator. To support viewing large-scale simulations, we use the WebGL-based library PixiJS (https://github.com/pixijs/pixi.js) for fast rendering of vehicles and traffic signals. Figure 1 shows some screenshots of the GUI under several scenarios.

3. Performance

3.1. Efficiency

We compare the performance of SUMO and CityFlow under different scenarios. The experiments run on an Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz. As Figure 2 shows, CityFlow outperforms SUMO with a single thread in all scenarios, from light to heavy traffic. The speedup is even more significant with more threads. Using 8 threads, we achieve about a 25-fold speedup on a large-scale $30\times 30$ road network with tens of thousands of vehicles, reaching 72 simulation steps per second. In addition, CityFlow is more efficient when retrieving simulation information via the Python interface. This is mainly because SUMO uses sockets for interaction, while CityFlow uses pybind11 for seamless C++ and Python integration.
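
As a quick consistency check on these figures, the reported throughput and speedup imply a SUMO rate close to the "around three steps per second" quoted in the introduction. The wall-clock estimate additionally assumes a 1-second simulation step, which is an assumption made for this illustration.

```python
cityflow_steps_per_sec = 72.0  # reported: 8 threads, 30x30 grid
speedup = 25.0                 # reported, approximate

# Implied SUMO throughput on the same scenario.
sumo_steps_per_sec = cityflow_steps_per_sec / speedup

# Wall-clock time to simulate one hour of traffic, assuming 1 s per step.
steps_per_hour = 3600
cityflow_seconds = steps_per_hour / cityflow_steps_per_sec
sumo_seconds = steps_per_hour / sumo_steps_per_sec

print(round(sumo_steps_per_sec, 2))  # 2.88
print(round(cityflow_seconds), round(sumo_seconds))  # 50 1250
```

Under that assumption, one simulated hour takes under a minute instead of roughly twenty minutes, which matters when training needs many episodes.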

3.2. Effectiveness

We evaluate the effectiveness of CityFlow by comparing it to SUMO, because SUMO is already a widely used traffic simulator and its effectiveness is accepted by domain experts. We compare the average duration of vehicles (the time between a vehicle entering and leaving the road network) under different traffic volume settings. As Table 1 shows, the difference is within a reasonable range, at most 1.06% across the tested volumes.
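
The relative differences can be recomputed from the rounded durations in Table 1. The sketch assumes the difference is taken relative to the SUMO value; because the published durations are rounded to two decimals, the recomputed percentages agree with the table only to within about 0.02 percentage points.

```python
volumes = [100, 200, 300, 400, 500]  # vehicles per hour
sumo = [40.76, 41.57, 42.75, 44.08, 45.93]
cityflow = [40.79, 41.58, 42.62, 43.84, 45.45]
reported = [0.07, 0.04, 0.30, 0.54, 1.06]  # percent, from Table 1

# Relative difference in percent, taken with respect to SUMO (assumed).
rel_diff = [abs(c - s) / s * 100 for s, c in zip(sumo, cityflow)]

for volume, diff in zip(volumes, rel_diff):
    print(volume, f"{diff:.2f}%")
```

The gap widens with traffic volume but stays near 1% at the heaviest load tested.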

4. Demo Detail

We plan to demonstrate CityFlow in different traffic scenarios and show its capability to serve as a reinforcement learning testbed.

The demo consists of the following parts:

• Simulate traffic in various scenarios, from synthetic grid scenarios to real-world scenarios, and from small road networks with dozens of vehicles to large-scale networks with tens of thousands of vehicles.

• Show the effectiveness of the car-following model, intersection logic, and lane change behavior of the simulator.

• Show a complete reinforcement learning training episode that optimizes a traffic signal plan. Participants can observe the gradual improvement of traffic conditions during training.

• Let demo participants control the cycle length and green ratio of traffic signals, change the traffic volume, and see instant feedback on how traffic conditions change.

We have published a video on YouTube (https://youtu.be/qeE4hRmWONM) that demonstrates the expected effect. The project is under active development, and we are likely to add other features (e.g. more map options, vehicle controls) and demonstrate more functions at the conference.

No special hardware is required since we are demonstrating a software project (learning platform). We will bring our laptop. It would be great if a monitor is provided.

5. Summary

We propose CityFlow, an efficient multi-agent reinforcement learning environment for large-scale city traffic scenarios. Researchers can use it as a testbed for the traffic signal control problem and to conduct research on urban mobility. We will demonstrate its usage and some results of RL-controlled traffic signal plans. We are also actively developing the project and plan to support more RL scenarios, such as dynamic vehicle routing and policies for reversible or limited lanes, as well as to open-source the project in the near future.

Bibliography

Martin Fellendorf and Peter Vortisch. 2010. Microscopic traffic flow simulator VISSIM. In Fundamentals of traffic simulation. Springer, 63–93.
Wenzel Jakob, Jason Rhinelander, and Dean Moldovan. 2017. pybind11 – Seamless operability between C++11 and Python. https://github.com/pybind/pybind11.
Daniel Krajzewicz and Jakob Erdmann. 2013. Road intersection model in SUMO. In 1st SUMO User Conference-SUMO, Vol. 21. 212–220.
Stefan Krauß. 1998. Microscopic modeling of traffic flow: Investigation of collision free vehicle dynamics. Ph.D. Dissertation. Universität zu Köln.
Li Li, Yisheng Lv, and Fei-Yue Wang. 2016. Traffic signal timing via deep reinforcement learning. IEEE/CAA Journal of Automatica Sinica 3, 3 (2016), 247–254.
Yuxi Li. 2017. Deep reinforcement learning: An overview. arXiv preprint arXiv:1701.07274 (2017).
Pablo Alvarez Lopez, Michael Behrisch, Laura Bieker-Walz, Jakob Erdmann, Yun-Pang Flötteröd, Robert Hilbrich, Leonhard Lücken, Johannes Rummel, Peter Wagner, and Evamarie Wießner. 2018. Microscopic Traffic Simulation using SUMO. In The 21st IEEE International Conference on Intelligent Transportation Systems (ITSC). IEEE. https://elib.dlr.de/124092/