CityPersons: A Diverse Dataset for Pedestrian Detection
Shanshan Zhang, Rodrigo Benenson, Bernt Schiele

TL;DR
This paper introduces CityPersons, a diverse pedestrian dataset built on Cityscapes, and demonstrates how it enables training a single CNN model that generalizes well across multiple benchmarks, improving detection especially in challenging scenarios.
Contribution
The paper presents CityPersons, a new set of person annotations on top of the Cityscapes dataset, and shows that training on it improves pedestrian detection performance and generalization across benchmarks.
Findings
With key CNN design adaptations, plain FasterRCNN reaches state-of-the-art results on the Caltech dataset.
CityPersons enables training a single model that generalizes across benchmarks.
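Additional training on CityPersons improves FasterRCNN results on Caltech, especially under heavy occlusion and at small scale, and yields higher localization quality.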
Abstract
Convnets have enabled significant progress in pedestrian detection recently, but there are still open questions regarding suitable architectures and training data. We revisit CNN design and point out key adaptations, enabling plain FasterRCNN to obtain state-of-the-art results on the Caltech dataset. To achieve further improvement from more and better data, we introduce CityPersons, a new set of person annotations on top of the Cityscapes dataset. The diversity of CityPersons allows us for the first time to train one single CNN model that generalizes well over multiple benchmarks. Moreover, with additional training with CityPersons, we obtain top results using FasterRCNN on Caltech, improving especially for more difficult cases (heavy occlusion and small scale) and providing higher localization quality.
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Indoor and Outdoor Localization Technologies
