DyCrowd: Towards Dynamic Crowd Reconstruction from a Large-scene Video

Hao Wen; Hongbo Kang; Jian Ma; Jing Huang; Yuanwang Yang; Haozhe Lin; Yu-Kun Lai; Kun Li

arXiv:2508.12644 · cs.CV · August 19, 2025

TL;DR

DyCrowd introduces a novel framework for spatio-temporally consistent 3D reconstruction of large crowds in videos, effectively handling occlusions and temporal inconsistencies through a group-guided optimization and motion prior.

Contribution

The paper presents the first framework for dynamic crowd reconstruction from large-scene videos, incorporating a group-guided motion optimization and a VAE-based human motion prior.

Findings

01. Achieves state-of-the-art performance in large-scene crowd reconstruction
02. Effectively handles occlusions and temporal inconsistencies
03. Provides a new virtual benchmark dataset for evaluation

Abstract

3D reconstruction of dynamic crowds in large scenes has become increasingly important for applications such as city surveillance and crowd analysis. However, current works attempt to reconstruct 3D crowds from a static image, causing a lack of temporal consistency and inability to alleviate the typical impact caused by occlusions. In this paper, we propose DyCrowd, the first framework for spatio-temporally consistent 3D reconstruction of hundreds of individuals' poses, positions and shapes from a large-scene video. We design a coarse-to-fine group-guided motion optimization strategy for occlusion-robust crowd reconstruction in large scenes. To address temporal instability and severe occlusions, we further incorporate a VAE (Variational Autoencoder)-based human motion prior along with a segment-level group-guided optimization. The core of our strategy leverages collective crowd behavior…

Taxonomy

Topics: Digital Media Forensic Detection · Advanced Vision and Imaging · Video Analysis and Summarization