SLOPER4D: A Scene-Aware Dataset for Global 4D Human Pose Estimation in Urban Environments
Yudi Dai (1), Yitai Lin (1), Xiping Lin (2), Chenglu Wen (1), Lan Xu (2), Hongwei Yi (3), Siqi Shen (1), Yuexin Ma (2), Cheng Wang (1) ((1) Xiamen University, China, (2) ShanghaiTech University, China, (3) Max Planck Institute for Intelligent Systems, Germany)

TL;DR
SLOPER4D is a scene-aware dataset of human-scene interaction in large urban environments, recorded from an egocentric view with a head-mounted LiDAR and camera. It provides frame-wise annotations and benchmark tasks for research on global 4D human pose estimation.
Contribution
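The paper introduces SLOPER4D, a large-scale scene-aware dataset, together with a joint optimization method that fits local SMPL meshes to the scene and fine-tunes the camera calibration frame by frame to obtain accurate 3D ground truth in dynamic urban environments.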
Findings
Existing methods struggle with SLOPER4D's large, dynamic urban scenes.
Benchmark results on the dataset highlight open challenges in current 3D human pose estimation.
The dataset enables new research directions in urban human-scene interaction.
Abstract
We present SLOPER4D, a novel scene-aware dataset collected in large urban environments to facilitate the research of global human pose estimation (GHPE) with human-scene interaction in the wild. Employing a head-mounted device integrated with a LiDAR and camera, we record 12 human subjects' activities over 10 diverse urban scenes from an egocentric view. Frame-wise annotations for 2D key points, 3D pose parameters, and global translations are provided, together with reconstructed scene point clouds. To obtain accurate 3D ground truth in such large dynamic scenes, we propose a joint optimization method to fit local SMPL meshes to the scene and fine-tune the camera calibration during dynamic motions frame by frame, resulting in plausible and scene-natural 3D human poses. Eventually, SLOPER4D consists of 15 sequences of human motions, each of which has a trajectory length of more than 200…
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Advanced Vision and Imaging
