UniHPR: Unified Human Pose Representation via Singular Value Contrastive Learning

Zhongyu Jiang; Wenhao Chai; Lei Li; Zhuoran Zhou; Cheng-Yen Yang; Jenq-Neng Hwang

arXiv:2510.19078 · cs.CV · October 23, 2025

TL;DR

UniHPR uses contrastive learning to align human pose representations from images, 2D poses, and 3D poses into one unified embedding, and reports improved pose estimation and retrieval accuracy.

Contribution

The paper presents a new singular value-based contrastive loss for aligning multi-modal human pose embeddings, enhancing cross-modal fusion and downstream task performance.
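
As background for the contrastive alignment described above, here is a minimal sketch of a standard pairwise InfoNCE loss summed over modality pairs. This is a common baseline, not the paper's singular value-based loss, which this summary does not describe in detail. The function names (`info_nce`, `pairwise_alignment_loss`), the temperature value, and the embedding shapes are all assumptions made for illustration.

```python
import numpy as np

def info_nce(a, b, temperature=0.07):
    """Symmetric InfoNCE between two batches of embeddings (row i of a matches row i of b)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)  # L2-normalize rows
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                    # cosine similarities, scaled

    def cross_entropy_diag(l):
        # Numerically stable log-softmax; the positive pair sits on the diagonal.
        l = l - l.max(axis=1, keepdims=True)
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.diag(log_prob).mean()

    return float(0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T)))

def pairwise_alignment_loss(embeddings, temperature=0.07):
    """Sum InfoNCE over every pair of modalities (hypothetical baseline, e.g. image, 2D, 3D)."""
    total = 0.0
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            total += info_nce(embeddings[i], embeddings[j], temperature)
    return total

# Toy usage with random stand-ins for image, 2D-pose, and 3D-pose embeddings.
rng = np.random.default_rng(0)
base = rng.normal(size=(8, 16))
image_emb = base + 0.01 * rng.normal(size=base.shape)
pose2d_emb = base + 0.01 * rng.normal(size=base.shape)
pose3d_emb = base + 0.01 * rng.normal(size=base.shape)
loss = pairwise_alignment_loss([image_emb, pose2d_emb, pose3d_emb])
```

With three modalities this baseline needs three pairwise terms, and the count grows quadratically as modalities are added, which is one reason a loss that treats more than two modalities jointly can be attractive.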

Findings

01 Achieves 49.9 mm MPJPE on the Human3.6M dataset

02 Attains 51.6 mm PA-MPJPE on the 3DPW dataset

03 Enables accurate 2D and 3D pose retrieval
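
For readers unfamiliar with the two metrics reported above, the following sketch shows how they are conventionally computed. MPJPE is the mean Euclidean distance between predicted and ground-truth joints; PA-MPJPE is the same error after the prediction is aligned to the ground truth with the best similarity transform (Procrustes alignment). The function names and the 17-joint toy skeleton are assumptions for illustration, and the paper's exact evaluation protocol may differ.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error for poses of shape (num_joints, 3)."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def procrustes_align(pred, gt):
    """Align pred to gt with the optimal scale, rotation, and translation."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # SVD of the cross-covariance gives the optimal rotation (Kabsch/Umeyama).
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(u @ vt))
    fix = np.diag([1.0, 1.0, d])
    rotation = u @ fix @ vt
    scale = (s * np.diag(fix)).sum() / (p ** 2).sum()
    return scale * p @ rotation + mu_g

def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment of the prediction to the ground truth."""
    return mpjpe(procrustes_align(pred, gt), gt)

# Toy usage: a prediction that differs from the ground truth only by a
# similarity transform has a large MPJPE but a near-zero PA-MPJPE.
rng = np.random.default_rng(0)
gt_pose = rng.normal(size=(17, 3))
rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
pred_pose = 2.0 * gt_pose @ rot_z + np.array([1.0, -2.0, 0.5])
raw_error = mpjpe(pred_pose, gt_pose)
aligned_error = pa_mpjpe(pred_pose, gt_pose)
```

Because PA-MPJPE removes global scale, rotation, and translation, it isolates errors in the articulated pose itself, so it is typically lower than MPJPE for the same prediction.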

Abstract

In recent years, there has been growing interest in developing effective alignment pipelines that generate unified representations from different modalities for multi-modal fusion and generation. As an important component of human-centric applications, human pose representations are critical in many downstream tasks, such as human pose estimation, action recognition, human-computer interaction, and object tracking. Human pose representations, or embeddings, can be extracted from images, 2D keypoints, 3D skeletons, mesh models, and many other modalities. Yet few studies have clearly examined the correlation among all of these representations using a contrastive paradigm. In this paper, we propose UniHPR, a unified human pose representation learning pipeline that aligns human pose embeddings from images, 2D human poses, and 3D human poses. To align more than two data…

Taxonomy

Topics: Human Pose and Action Recognition · Robot Manipulation and Learning · Human Motion and Animation