Fine-Grained Classification of Pedestrians in Video: Benchmark and State of the Art
David Hall, Pietro Perona

TL;DR
This paper introduces a comprehensive video dataset for fine-grained pedestrian classification, tracking, detection, and pose estimation, along with benchmark results of state-of-the-art algorithms to facilitate future research.
Contribution
It provides a new in-the-wild pedestrian dataset with detailed annotations and benchmarks for multiple tasks, advancing research in fine-grained pedestrian analysis.
Findings
State-of-the-art algorithms for fine-grained classification and pose estimation were tested on the dataset, and their results are reported as a performance baseline.
The dataset includes 27,454 bounding box and pose labels across 4,222 tracks.
Benchmark results highlight current algorithm strengths and limitations.
Abstract
A video dataset that is designed to study fine-grained categorisation of pedestrians is introduced. Pedestrians were recorded "in-the-wild" from a moving vehicle. Annotations include bounding boxes, tracks, 14 keypoints with occlusion information and the fine-grained categories of age (5 classes), sex (2 classes), weight (3 classes) and clothing style (4 classes). There are a total of 27,454 bounding box and pose labels across 4,222 tracks. This dataset is designed to train and test algorithms for fine-grained categorisation of people; it is also useful for benchmarking tracking, detection and pose estimation of pedestrians. State-of-the-art algorithms for fine-grained classification and pose estimation were tested using the dataset and the results are reported as a useful performance baseline.
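The annotation layout described in the abstract can be pictured as one record per labelled pedestrian instance. The following is a minimal sketch, not the dataset's actual file format: the names `Keypoint`, `PedestrianLabel`, and `mean_labels_per_track`, the `(x, y, width, height)` box layout, the integer encoding of categories, and the choice to store categories on every label are all assumptions made for illustration. Only the counts (14 keypoints; 5, 2, 3, and 4 classes; 27,454 labels; 4,222 tracks) come from the abstract.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Counts taken from the abstract. The class names are not given there,
# so categories are stored as integer indices (an assumption).
NUM_CLASSES = {"age": 5, "sex": 2, "weight": 3, "clothing": 4}
NUM_KEYPOINTS = 14
TOTAL_LABELS = 27_454
TOTAL_TRACKS = 4_222


@dataclass
class Keypoint:
    x: float
    y: float
    occluded: bool  # per-keypoint occlusion flag


@dataclass
class PedestrianLabel:
    track_id: int
    frame: int
    bbox: Tuple[float, float, float, float]  # hypothetical (x, y, width, height)
    keypoints: List[Keypoint]
    age: int
    sex: int
    weight: int
    clothing: int

    def __post_init__(self) -> None:
        # Reject records that do not match the counts in the abstract.
        if len(self.keypoints) != NUM_KEYPOINTS:
            raise ValueError(f"expected {NUM_KEYPOINTS} keypoints")
        for name, n_classes in NUM_CLASSES.items():
            value = getattr(self, name)
            if not 0 <= value < n_classes:
                raise ValueError(f"{name} index {value} outside [0, {n_classes})")


def mean_labels_per_track() -> float:
    """Average number of bounding box and pose labels per track."""
    return TOTAL_LABELS / TOTAL_TRACKS
```

Under these assumptions, a loader would build one `PedestrianLabel` per annotated frame of a track, and the validation in `__post_init__` would catch records whose keypoint count or category index falls outside the ranges stated in the abstract.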
Taxonomy
Topics: Video Surveillance and Tracking Methods · Gait Recognition and Analysis · Human Pose and Action Recognition
