Pixel-wise Crowd Understanding via Synthetic Data
Qi Wang, Junyu Gao, Wei Lin, Yuan Yuan

TL;DR
This paper introduces a synthetic crowd dataset generated from a video game and proposes methods to improve pixel-wise crowd understanding by leveraging synthetic data, enhancing performance in real-world scenarios.
Contribution
The paper creates a large-scale synthetic crowd dataset from a video game and proposes two ways of using it to improve real-world crowd analysis: pre-training on the synthetic data, and domain adaptation.
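As a rough illustration of the pre-training idea only, the sketch below trains a toy linear regressor on a large "synthetic" set and then fine-tunes it on a small "real" set. It is not the paper's networks, data, or training procedure: the linear model, the random data, and every name here (`train_linear`, `mse`, `w_synth`, `w_real`) are assumptions made up for this example.

```python
import numpy as np


def train_linear(X, y, w_init, lr=0.05, steps=200):
    """Plain gradient descent on mean squared error, starting from w_init."""
    w = w_init.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w


def mse(X, y, w):
    """Mean squared error of the linear model w on (X, y)."""
    return float(np.mean((X @ w - y) ** 2))


rng = np.random.default_rng(0)
d = 5

# Hypothetical ground-truth mappings: the synthetic domain is close to,
# but not identical to, the real domain (a stand-in for domain gap).
w_real = rng.normal(size=d)
w_synth = w_real + 0.1 * rng.normal(size=d)

# Large, cheaply labeled synthetic set; small, expensive real set.
X_synth = rng.normal(size=(2000, d))
y_synth = X_synth @ w_synth
X_real = rng.normal(size=(20, d))
y_real = X_real @ w_real

# Stage 1: pre-train on synthetic data from a zero initialization.
w_pre = train_linear(X_synth, y_synth, np.zeros(d))

# Stage 2: fine-tune on the small real set, starting from the pre-trained weights.
w_ft = train_linear(X_real, y_real, w_pre, steps=50)

print("real-data MSE before fine-tuning:", mse(X_real, y_real, w_pre))
print("real-data MSE after fine-tuning: ", mse(X_real, y_real, w_ft))
```

The point of the two-stage structure is that the second stage starts from weights already shaped by plentiful labeled data, so the small real set only has to correct the remaining gap between domains.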
Findings
Synthetic data improves model performance on real data.
Pre-training on synthetic data enhances accuracy.
Domain adaptation yields photo-realistic training images.
Abstract
Crowd analysis via computer vision techniques is an important topic in video surveillance, with widespread applications including crowd monitoring, public safety, and space design. Pixel-wise crowd understanding is the most fundamental task in crowd analysis because it yields finer results for video sequences or still images than other analysis tasks. Unfortunately, pixel-level understanding needs a large amount of labeled training data. Annotating such data is expensive, so current crowd datasets are small. As a result, most algorithms suffer from over-fitting to varying degrees. In this paper, taking crowd counting and segmentation as examples of pixel-wise crowd understanding, we attempt to remedy these problems from two aspects, namely data and methodology. Firstly, we develop a free data collector and labeler to generate synthetic and…
Taxonomy
Topics: Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications · Human Pose and Action Recognition
