UniHPE: Towards Unified Human Pose Estimation via Contrastive Learning

Zhongyu Jiang; Wenhao Chai; Lei Li; Zhuoran Zhou; Cheng-Yen Yang; Jenq-Neng Hwang

arXiv:2311.16477 · cs.CV · November 29, 2023 · 2 cites

TL;DR

UniHPE introduces a unified framework that aligns multiple human pose estimation modalities using contrastive learning, significantly improving accuracy in 2D and 3D pose estimation tasks.

Contribution

The paper presents a unified pipeline and a singular-value-based contrastive loss that together align three pose modalities at once: 2D, lifting-based 3D, and image-based 3D human pose estimation.
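
To make the idea of contrastive alignment concrete, the sketch below uses a standard symmetric InfoNCE (CLIP-style) loss applied pairwise across three embedding sets. This is a generic stand-in, not the paper's singular-value-based loss, whose formulation is not given on this page. The function names, the temperature value, and the assumption that each modality has already been encoded into a batch of fixed-size embeddings are all illustrative.

```python
import numpy as np


def _log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def info_nce(a, b, temperature=0.07):
    """Symmetric InfoNCE between two (N, D) embedding batches.

    Row i of `a` and row i of `b` are treated as a positive pair;
    every other row in the batch acts as a negative.
    """
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature          # (N, N) cosine similarities
    idx = np.arange(len(a))
    loss_ab = -_log_softmax(logits, axis=1)[idx, idx].mean()  # a -> b
    loss_ba = -_log_softmax(logits, axis=0)[idx, idx].mean()  # b -> a
    return float(0.5 * (loss_ab + loss_ba))


def tri_modal_loss(feat_a, feat_b, feat_c, temperature=0.07):
    """Average pairwise InfoNCE over three modalities (hypothetical helper)."""
    return (
        info_nce(feat_a, feat_b, temperature)
        + info_nce(feat_a, feat_c, temperature)
        + info_nce(feat_b, feat_c, temperature)
    ) / 3.0
```

Under these assumptions, the loss is small when matching rows across modalities point in the same direction and grows when the pairing is broken, which is the behavior an alignment objective of this kind is meant to have.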

Findings

01

Achieves an MPJPE of 50.5 mm on Human3.6M

02

Achieves a PA-MPJPE of 51.6 mm on 3DPW

03

Demonstrates improved multi-modal pose estimation performance.
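
The two metrics in the findings above can be sketched as follows. MPJPE is the mean Euclidean distance between predicted and ground-truth joints; the Procrustes-aligned variant first fits a similarity transform (scale, rotation, translation) from the prediction to the ground truth. This is a minimal NumPy sketch for a single pose of shape (J, 3), not the paper's evaluation code; the function names are illustrative, and details such as root-joint alignment or per-dataset joint sets are left out.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance over joints."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def procrustes_align(pred, gt):
    """Similarity-align pred (J, 3) to gt (J, 3) in the least-squares sense."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # Cross-covariance between centered prediction and ground truth.
    H = p.T @ g
    U, S, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T
    scale = (S * np.diag(D)).sum() / (p ** 2).sum()
    return scale * p @ R.T + mu_g


def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment of the prediction to the ground truth."""
    return mpjpe(procrustes_align(pred, gt), gt)
```

Because the alignment removes global scale, rotation, and translation, the aligned error measures only the residual difference in pose shape, so it is never meaningfully larger than what the best similarity transform allows.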

Abstract

In recent times, there has been a growing interest in developing effective perception techniques for combining information from multiple modalities. This involves aligning features obtained from diverse sources to enable more efficient training with larger datasets and constraints, as well as leveraging the wealth of information contained in each modality. 2D and 3D Human Pose Estimation (HPE) are two critical perceptual tasks in computer vision, which have numerous downstream applications, such as Action Recognition, Human-Computer Interaction, Object tracking, etc. Yet, there are limited instances where the correlation between Image and 2D/3D human pose has been clearly researched using a contrastive paradigm. In this paper, we propose UniHPE, a unified Human Pose Estimation pipeline, which aligns features from all three modalities, i.e., 2D human pose estimation, lifting-based and…

Taxonomy

Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Advanced Vision and Imaging

Methods: ALIGN · Contrastive Learning