Toward a Real-Time Framework for Accurate Monocular 3D Human Pose Estimation with Geometric Priors

Mohamed Adjel (LAAS)

arXiv:2507.16850·cs.CV·July 24, 2025

TL;DR

This paper introduces a real-time monocular 3D human pose estimation framework that combines 2D keypoint detection with geometric priors, leveraging camera and anatomical knowledge for improved accuracy and efficiency in unconstrained environments.

Contribution

It presents a novel framework integrating real-time 2D detection with geometry-aware 2D-to-3D lifting using camera and biomechanical priors, enabling accurate monocular 3D pose estimation without heavy models.

Findings

01

Achieves real-time 3D pose estimation from monocular images.

02

Leverages camera intrinsics and anatomical priors for improved accuracy.

03

Generates large-scale, plausible 2D-3D training pairs from MoCap and synthetic datasets.
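As a rough illustration of the third finding, the sketch below shows one simple way 2D-3D training pairs could be produced from 3D joint positions: project each pose through a pinhole camera with sampled intrinsics. This is not the paper's pipeline, which builds on self-calibration and biomechanically-constrained inverse kinematics. The function names, the sampling ranges, and the image size are hypothetical choices made for the example.

```python
import random

def project(joint, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame 3D point (z > 0) to pixel coordinates."""
    x, y, z = joint
    return (fx * x / z + cx, fy * y / z + cy)

def make_pairs(pose_3d, n_cameras, seed=0, width=640, height=480):
    """Build (intrinsics, 2D keypoints, 3D joints) samples from one 3D pose.

    Assumption: each virtual camera looks down +z and the subject is pushed
    away from the camera by a random depth offset; focal lengths are sampled
    from an arbitrary illustrative range.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_cameras):
        f = rng.uniform(500.0, 1500.0)       # hypothetical focal length range, in pixels
        cx, cy = width / 2.0, height / 2.0   # principal point at the image centre
        tz = rng.uniform(2.0, 6.0)           # hypothetical subject distance, in metres
        joints_cam = [(x, y, z + tz) for x, y, z in pose_3d]
        keypoints = [project(j, f, f, cx, cy) for j in joints_cam]
        pairs.append({
            "intrinsics": (f, f, cx, cy),
            "keypoints_2d": keypoints,
            "joints_3d": joints_cam,
        })
    return pairs

# Toy three-joint pose in metres (made-up values, not MoCap data).
toy_pose = [(0.0, 0.0, 0.0), (0.0, -0.5, 0.0), (0.2, -0.9, 0.1)]
samples = make_pairs(toy_pose, n_cameras=4)
```

Storing the intrinsics alongside each pair reflects the idea that the lifter is given the camera parameters explicitly instead of having to infer them from the keypoints.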

Abstract

Monocular 3D human pose estimation remains a challenging and ill-posed problem, particularly in real-time settings and unconstrained environments. While direct image-to-3D approaches require large annotated datasets and heavy models, 2D-to-3D lifting offers a more lightweight and flexible alternative, especially when enhanced with prior knowledge. In this work, we propose a framework that combines real-time 2D keypoint detection with geometry-aware 2D-to-3D lifting, explicitly leveraging known camera intrinsics and subject-specific anatomical priors. Our approach builds on recent advances in self-calibration and biomechanically-constrained inverse kinematics to generate large-scale, plausible 2D-3D training pairs from MoCap and synthetic datasets. We discuss how these ingredients can enable fast, personalized, and accurate 3D pose estimation from monocular images without requiring…
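To make concrete why known camera intrinsics and anatomical priors constrain the lifting problem, the following sketch works through the underlying geometry for a single bone. It is an illustration under stated assumptions, not the lifting model proposed in the paper: it assumes an ideal pinhole camera without distortion, a parent joint whose 3D position is already known, and a known bone length. The names `backproject` and `lift_child` are hypothetical.

```python
import math

def backproject(u, v, fx, fy, cx, cy):
    """Unit viewing ray through pixel (u, v) for a pinhole camera with known intrinsics."""
    x = (u - cx) / fx
    y = (v - cy) / fy
    n = math.sqrt(x * x + y * y + 1.0)
    return (x / n, y / n, 1.0 / n)

def lift_child(parent, ray, bone_length):
    """Candidate 3D positions of a child joint.

    The child lies on the viewing ray (t * ray, t > 0) and at distance
    bone_length from the parent. For a unit ray r and parent p this gives
    t^2 - 2 t (r . p) + |p|^2 - L^2 = 0, which has up to two roots.
    """
    b = sum(r * p for r, p in zip(ray, parent))
    c = sum(p * p for p in parent) - bone_length ** 2
    disc = b * b - c
    if disc < 0.0:
        # The ray never comes within bone_length of the parent (e.g. noisy
        # keypoint): fall back to the closest point on the ray.
        ts = [b]
    else:
        s = math.sqrt(disc)
        ts = [b - s, b + s]
    return [tuple(t * r for r in ray) for t in ts if t > 0.0]

# Made-up example values: intrinsics in pixels, positions in metres.
fx = fy = 1000.0
cx, cy = 320.0, 240.0
parent = (0.0, 0.0, 3.0)
true_child = (0.3, 0.0, 3.4)   # 0.5 m from the parent
u = fx * true_child[0] / true_child[2] + cx
v = fy * true_child[1] / true_child[2] + cy
candidates = lift_child(parent, backproject(u, v, fx, fy, cx, cy), 0.5)
```

With a valid keypoint the quadratic generally yields two candidates, one nearer to and one farther from the camera than the parent. That residual front/back ambiguity is one reason the problem is ill-posed, and it is the kind of case where additional priors or a learned lifter have to choose between geometrically valid solutions.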

Taxonomy

Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Video Surveillance and Tracking Methods