Video Motion Capture from the Part Confidence Maps of Multi-Camera Images by Spatiotemporal Filtering Using the Human Skeletal Model
Takuya Ohashi, Yosuke Ikegami, Kazuki Yamamoto, Wataru Takano, Yoshihiko Nakamura

TL;DR
This paper introduces a method for 3D human motion capture from multi-camera images using part confidence maps and a spatiotemporal filter based on the human skeletal model, achieving accurate and smooth motion data.
Contribution
It proposes a novel spatiotemporal filtering technique that integrates the human skeletal model into multi-camera motion capture for improved accuracy.
Findings
Mean per joint position error of 26.1 mm for regular motions
Mean per joint position error of 38.8 mm for inverted motions
Effective smoothing of human motion data
Abstract
This paper discusses video motion capture, namely, 3D reconstruction of human motion from multi-camera images. After the Part Confidence Maps are computed from each camera image, the proposed spatiotemporal filter is applied to deliver the human motion data with accuracy and smoothness for human motion analysis. The spatiotemporal filter uses the human skeleton and mixes temporal smoothing in two-time inverse kinematics computations. The experimental results show that the mean per joint position error was 26.1 mm for regular motions and 38.8 mm for inverted motions.
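The reported figures are mean per joint position errors. As a rough illustration of how such a number is commonly computed, the sketch below averages the Euclidean distance between estimated and reference 3D joint positions over all frames and joints. The function name `mpjpe`, the data layout, and the toy values are assumptions made for this example; the summary does not state the paper's exact evaluation protocol (joint set, alignment, reference system), so this should not be read as the authors' procedure.

```python
import math


def mpjpe(predicted, reference):
    """Mean per joint position error over all frames and joints.

    Both arguments are lists of frames; each frame is a list of (x, y, z)
    joint positions in millimetres, with joints in the same order.
    (Hypothetical data layout, chosen only for this illustration.)
    """
    if len(predicted) != len(reference):
        raise ValueError("frame counts differ")
    total, count = 0.0, 0
    for pred_frame, ref_frame in zip(predicted, reference):
        if len(pred_frame) != len(ref_frame):
            raise ValueError("joint counts differ")
        for p, r in zip(pred_frame, ref_frame):
            total += math.dist(p, r)  # Euclidean distance for one joint
            count += 1
    if count == 0:
        raise ValueError("no joints to compare")
    return total / count


# Toy data: one frame, two joints (values are made up, not from the paper).
reference = [[(0.0, 0.0, 0.0), (0.0, 0.0, 400.0)]]
predicted = [[(3.0, 4.0, 0.0), (0.0, 0.0, 410.0)]]
print(mpjpe(predicted, reference))  # per-joint errors 5 mm and 10 mm -> 7.5
```

Under this definition every joint in every frame carries equal weight, so a few badly reconstructed joints can noticeably raise the mean.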
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Human Motion and Animation
