Real-Time Human Motion Capture with Multiple Depth Cameras
Alireza Shafaei, James J. Little

TL;DR
This paper introduces a real-time, markerless human motion capture system built on a few Kinect depth cameras. It segments depth images into body parts with a model trained purely on synthetic data, then combines evidence across views to estimate 3D body pose without assuming a cooperative user.
Contribution
It presents a multi-camera, markerless motion capture approach that relaxes the camera placement constraints of earlier single-depth-camera work and is trained, via curriculum learning, on purely synthetic data.
Findings
Achieves real-time 3D pose estimation from multiple depth cameras.
Outperforms previous methods on the Berkeley MHAD dataset.
Introduces a dataset of ~6 million synthetic depth frames for multi-camera pose estimation.
Abstract
Commonly used human motion capture systems require intrusive attachment of markers that are visually tracked with multiple cameras. In this work we present an efficient and inexpensive solution to markerless motion capture using only a few Kinect sensors. Unlike previous work on 3D pose estimation using a single depth camera, we relax constraints on the camera location and do not assume a co-operative user. We apply recent image segmentation techniques to depth images and use curriculum learning to train our system on purely synthetic data. Our method accurately localizes body parts without requiring an explicit shape model. The body joint locations are then recovered by combining evidence from multiple views in real time. We also introduce a dataset of ~6 million synthetic depth frames for pose estimation from multiple cameras and exceed state-of-the-art results on the Berkeley…
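To make the "combining evidence from multiple views" step concrete, here is a minimal sketch of one way per-pixel body-part labels from several calibrated depth cameras could be fused in 3D. It is not the authors' method: the function names `backproject` and `fuse_part_centroids`, the pinhole intrinsics `K`, the 4x4 `cam_to_world` poses, and the use of a plain per-part centroid as a stand-in for the joint estimator are all assumptions made for illustration. The labels are assumed to come from some upstream segmenter, with -1 marking background.

```python
import numpy as np

def backproject(depth, K, cam_to_world):
    """Lift a depth image (metres, 0 = no reading) to world-space points.

    Assumes a pinhole camera with intrinsics K (3x3) and a 4x4
    camera-to-world pose. Returns (N, 3) points and the validity mask.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]            # pixel row (v) and column (u) grids
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - K[0, 2]) * z / K[0, 0]
    y = (v[valid] - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)  # homogeneous
    pts_world = pts_cam @ cam_to_world.T
    return pts_world[:, :3], valid

def fuse_part_centroids(views, num_parts):
    """Average the 3D points of each body part over all views.

    views: iterable of (depth, labels, K, cam_to_world), where labels is an
    integer image (-1 = background) produced by a hypothetical segmenter.
    Returns (num_parts, 3); parts seen by no camera are NaN.
    """
    sums = np.zeros((num_parts, 3))
    counts = np.zeros(num_parts)
    for depth, labels, K, cam_to_world in views:
        pts, valid = backproject(depth, K, cam_to_world)
        lab = labels[valid]
        keep = (lab >= 0) & (lab < num_parts)
        # Unbuffered accumulation so repeated labels all contribute.
        np.add.at(sums, lab[keep], pts[keep])
        np.add.at(counts, lab[keep], 1)
    centroids = np.full((num_parts, 3), np.nan)
    seen = counts > 0
    centroids[seen] = sums[seen] / counts[seen, None]
    return centroids
```

Because every view is mapped into a shared world frame before averaging, a part occluded in one camera can still be localized from another, which is the intuition behind using several sensors. A real system would likely replace the centroid with a learned or more robust estimator.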
