Leveraging Temporal Joint Depths for Improving 3D Human Pose Estimation in Video

Naoki Kato; Hiroto Honda; Yusuke Uchida

arXiv:2011.02172·cs.CV·November 5, 2020

Open Access

TL;DR

This paper introduces a method that leverages temporal joint depth information in videos to enhance the accuracy of 3D human pose estimation, addressing depth ambiguity issues present in 2D pose predictions.

Contribution

It proposes estimating a 3D pose in each frame of a video and then refining it with temporal information, which reduces joint depth ambiguity and improves estimation accuracy.

Findings

01 Reduced depth ambiguity in 3D pose estimation

02 Improved accuracy in 3D human pose predictions

03 Effective use of temporal information in videos

Abstract

The effectiveness of the approaches to predict 3D poses from 2D poses estimated in each frame of a video has been demonstrated for 3D human pose estimation. However, 2D poses without appearance information of persons have much ambiguity with respect to the joint depths. In this paper, we propose to estimate a 3D pose in each frame of a video and refine it considering temporal information. The proposed approach reduces the ambiguity of the joint depths and improves the 3D pose estimation accuracy.
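
The abstract does not say how the per-frame estimates are refined, so the following is only an illustrative sketch of the general idea of using neighbouring frames to stabilise joint depths. It swaps in a plain centered moving average, which is not the authors' method. The function name `refine_joint_depths`, the `radius` parameter, and the list-of-lists input layout are all assumptions made for this example.

```python
from typing import List


def refine_joint_depths(depths: List[List[float]], radius: int = 1) -> List[List[float]]:
    """Smooth per-frame joint depths with a centered moving average.

    Hypothetical stand-in for temporal refinement, not the paper's method.
    depths[t][j] is the estimated depth of joint j in frame t.
    """
    if radius < 0:
        raise ValueError("radius must be non-negative")
    num_frames = len(depths)
    refined = []
    for t in range(num_frames):
        # Clip the temporal window at the start and end of the clip.
        start = max(0, t - radius)
        end = min(num_frames, t + radius + 1)
        window = depths[start:end]
        num_joints = len(depths[t])
        # Average each joint's depth over the frames in the window.
        refined.append(
            [sum(frame[j] for frame in window) / len(window) for j in range(num_joints)]
        )
    return refined


# One joint over three frames; the middle frame has an outlier depth.
noisy = [[1.0], [4.0], [1.0]]
smoothed = refine_joint_depths(noisy, radius=1)
```

With `radius=0` the input is returned unchanged; larger values trade responsiveness to fast motion for smoother depth trajectories.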

Taxonomy

Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Diabetic Foot Ulcer Assessment and Management