TL;DR
This paper presents a novel method for multi-person motion capture using uncalibrated multi-view cameras, leveraging human semantics and motion priors to accurately recover human meshes and camera parameters without calibration tools.
Contribution
It introduces a unified approach that estimates camera parameters and human meshes simultaneously from noisy human semantics, eliminating the need for calibration tools or background features.
Findings
Camera parameters and human motions are recovered accurately in a one-step reconstruction.
The method handles occlusions and inter-person interactions.
It does not rely on background features or calibration procedures.
Abstract
Dynamic multi-person mesh recovery has broad applications in sports broadcasting, virtual reality, and video games. However, current multi-view frameworks rely on a time-consuming camera calibration procedure. In this work, we focus on multi-person motion capture with uncalibrated cameras, which faces two main challenges: first, inter-person interactions and occlusions introduce inherent ambiguities for both camera calibration and motion capture; second, a dynamic multi-person scene lacks the dense correspondences that could be used to constrain sparse camera geometries. Our key idea is to incorporate motion prior knowledge to simultaneously estimate camera parameters and human meshes from noisy human semantics. We first utilize human information from 2D images to initialize intrinsic and extrinsic parameters. Thus, the approach does not rely on any other calibration tools…
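To make the idea of estimating cameras and humans jointly from 2D human information more concrete, here is a minimal sketch of a confidence-weighted reprojection error for a single view. It assumes a standard pinhole camera model; the abstract does not state the paper's actual objective, so this is an illustration of the general kind of term such methods minimize, not the authors' loss. The function names `project` and `reprojection_error` and all numeric values are hypothetical.

```python
import numpy as np

def project(points_3d, K, R, t):
    """Pinhole projection of (N, 3) world points to (N, 2) pixel coordinates."""
    cam = points_3d @ R.T + t        # world -> camera coordinates (extrinsics)
    uv = cam @ K.T                   # apply the intrinsic matrix
    return uv[:, :2] / uv[:, 2:3]    # perspective divide

def reprojection_error(points_3d, keypoints_2d, confidence, K, R, t):
    """Confidence-weighted mean squared pixel error for one view."""
    residual = project(points_3d, K, R, t) - keypoints_2d
    per_joint = np.sum(residual ** 2, axis=1)
    return float(np.sum(confidence * per_joint) / np.sum(confidence))

# Hypothetical camera: focal length 1000 px, principal point (640, 360),
# identity rotation, placed 3 m in front of the subject.
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 3.0])

joints_3d = np.array([[0.0, 0.0, 0.0],
                      [0.5, 0.0, 0.0]])

# Simulated 2D detections: exact projections with one joint shifted by 3 px.
detections = project(joints_3d, K, R, t) + np.array([[3.0, 0.0], [0.0, 0.0]])
confidence = np.array([1.0, 1.0])

error = reprojection_error(joints_3d, detections, confidence, K, R, t)
```

In an uncalibrated setting, a term like this would be summed over views and frames and minimized with respect to both the camera parameters (`K`, `R`, `t`) and the 3D joints. The per-joint confidence weights let unreliable 2D detections, for example under occlusion, contribute less.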
