Hybrid 3D Human Pose Estimation with Monocular Video and Sparse IMUs

Yiming Bao; Xu Zhao; Dahong Qian

arXiv:2404.17837·cs.CV·April 30, 2024

Open Access

TL;DR

This paper introduces RTOF, a real-time framework that fuses monocular video and sparse inertial data to improve 3D human pose estimation accuracy, smoothness, and physical plausibility.

Contribution

The novel RTOF framework effectively integrates heterogeneous visual and inertial data for more accurate and realistic 3D human pose estimation.

Findings

01

Significantly reduced pose estimation error on the Total Capture dataset

02

Produced smooth and biomechanically plausible human motions

03

Demonstrated efficiency and rationality through ablation studies

Abstract

Temporal 3D human pose estimation from monocular videos is a challenging task in human-centered computer vision due to the depth ambiguity of 2D-to-3D lifting. To improve accuracy and address occlusion, inertial sensors have been introduced as a complementary source of information. However, it remains challenging to integrate heterogeneous sensor data to produce physically rational 3D human poses. In this paper, we propose a novel framework, Real-time Optimization and Fusion (RTOF), to address this issue. We first incorporate sparse inertial orientations into a parametric human skeleton to refine 3D poses in kinematics. The poses are then optimized by energy functions built on both visual and inertial observations to reduce temporal jitter. Our framework outputs smooth and biomechanically plausible human motion. Comprehensive experiments with ablation studies…
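
To make the idea of an energy built on both visual and inertial observations more concrete, the following is a minimal toy sketch, not the RTOF formulation itself. It assumes a single scalar joint parameter per frame, a visual estimate and an inertial estimate of that parameter, and a quadratic energy with two data terms and one temporal smoothness term. The function name `smooth_fuse`, the weights `w_vis`, `w_imu`, `w_smooth`, and the offline batch solve are all illustrative assumptions; the paper describes a real-time method whose actual energy terms are not given in this summary.

```python
def smooth_fuse(visual, inertial, w_vis=1.0, w_imu=1.0, w_smooth=4.0):
    """Minimise a toy quadratic energy over a 1D trajectory x:

        E(x) = sum_t [ w_vis * (x_t - visual_t)^2 + w_imu * (x_t - inertial_t)^2 ]
             + w_smooth * sum_{t>0} (x_t - x_{t-1})^2

    Hypothetical illustration only; requires w_vis + w_imu > 0.
    """
    n = len(visual)
    if n != len(inertial):
        raise ValueError("visual and inertial must have the same length")
    if n == 0:
        return []

    # Setting dE/dx_t = 0 gives a tridiagonal linear system A x = b.
    diag, rhs = [], []
    for t in range(n):
        neighbours = (t > 0) + (t < n - 1)  # 1 at the ends, 2 inside
        diag.append(w_vis + w_imu + w_smooth * neighbours)
        rhs.append(w_vis * visual[t] + w_imu * inertial[t])
    off = -w_smooth  # constant sub- and super-diagonal entry

    # Thomas algorithm: forward elimination ...
    for t in range(1, n):
        factor = off / diag[t - 1]
        diag[t] -= factor * off
        rhs[t] -= factor * rhs[t - 1]

    # ... followed by back substitution.
    x = [0.0] * n
    x[-1] = rhs[-1] / diag[-1]
    for t in range(n - 2, -1, -1):
        x[t] = (rhs[t] - off * x[t + 1]) / diag[t]
    return x


def roughness(seq):
    """Sum of squared frame-to-frame differences, a simple jitter measure."""
    return sum((b - a) ** 2 for a, b in zip(seq, seq[1:]))
```

With `w_smooth=0` the result reduces to a per-frame weighted average of the two observations; increasing `w_smooth` trades fidelity to the observations for a smoother trajectory, which is the basic trade-off any jitter-reducing energy has to balance.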

Taxonomy

Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Gait Recognition and Analysis