On the Role of Rotation Equivariance in Monocular 3D Human Pose Estimation

Pavlo Melnyk, Cuong Le, Urs Waldmann, Per-Erik Forssén, and Bastian Wandt

arXiv:2601.13913 · cs.CV · January 21, 2026

Open Access

TL;DR

This paper demonstrates that incorporating 2D rotation equivariance into monocular 3D human pose estimation models improves accuracy and can be effectively learned through data augmentation, outperforming existing equivariant-by-design approaches.

Contribution

The study shows that 2D rotation equivariance enhances monocular 3D human pose estimation and can be learned without explicit architectural constraints, simplifying the model design.

Findings

01 Rotation equivariance improves pose estimation accuracy.

02 Augmentation-based learning outperforms equivariant-by-design methods.

03 Model achieves state-of-the-art results on standard benchmarks.

Abstract

Estimating 3D from 2D is one of the central tasks in computer vision. In this work, we consider the monocular setting, i.e. single-view input, for 3D human pose estimation (HPE). Here, the task is to predict a 3D point set of human skeletal joints from a single 2D input image. While by definition this is an ill-posed problem, recent work has presented methods that solve it with up to several-centimetre error. Typically, these methods employ a two-step approach, where the first step is to detect the 2D skeletal joints in the input image, followed by the step of 2D-to-3D lifting. We find that common lifting models fail when encountering a rotated input. We argue that learning a single human pose along with its in-plane rotations is considerably easier and more geometrically grounded than directly learning a point-to-point mapping. Furthermore, our intuition is that endowing the model with…
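To make the notion of in-plane rotation equivariance concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation: 2D joints are assumed to be centred on the image centre, 3D poses are assumed to be root-relative in camera coordinates so that an in-plane rotation corresponds to a rotation about the camera's optical (z) axis, and all function names (`rotate_pair`, `equivariance_error`, the two toy lifters) are hypothetical.

```python
import numpy as np


def rotation_2d(theta):
    """2x2 counter-clockwise rotation matrix for an angle in radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


def rotation_about_z(theta):
    """3x3 rotation about the z-axis (assumed here to be the optical axis)."""
    rot = np.eye(3)
    rot[:2, :2] = rotation_2d(theta)
    return rot


def rotate_pair(kp2d, pose3d, theta):
    """Rotate a (J, 2) 2D pose and its (J, 3) 3D target by the same angle.

    Applying this to training pairs with random angles is one way to
    realise rotation augmentation (an assumption of this sketch).
    """
    return kp2d @ rotation_2d(theta).T, pose3d @ rotation_about_z(theta).T


def equivariance_error(lifter, kp2d, theta):
    """Mean per-joint distance between lift(rotate(x)) and rotate(lift(x)).

    The value is zero for a lifter that is exactly equivariant to
    in-plane rotations.
    """
    lift_then_rotate = lifter(kp2d) @ rotation_about_z(theta).T
    rotate_then_lift = lifter(kp2d @ rotation_2d(theta).T)
    diff = lift_then_rotate - rotate_then_lift
    return float(np.mean(np.linalg.norm(diff, axis=-1)))


def toy_equivariant_lifter(kp2d):
    """Toy lifter: depth is the joint's distance from the origin.

    That distance is unchanged by rotation, so the map is equivariant.
    """
    depth = np.linalg.norm(kp2d, axis=-1, keepdims=True)
    return np.concatenate([kp2d, depth], axis=-1)


def toy_non_equivariant_lifter(kp2d):
    """Toy lifter: depth copies the x coordinate, which changes under rotation."""
    return np.concatenate([kp2d, kp2d[:, :1]], axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    kp2d = rng.normal(size=(17, 2))  # 17 joints, a hypothetical skeleton size
    print(equivariance_error(toy_equivariant_lifter, kp2d, 0.7))
    print(equivariance_error(toy_non_equivariant_lifter, kp2d, 0.7))
```

The same `equivariance_error` check could be applied to a trained lifting model to see how it behaves on rotated inputs. The toy lifters exist only to show one case where the error vanishes and one where it does not.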

Taxonomy

Topics: Human Pose and Action Recognition · Robot Manipulation and Learning · Gait Recognition and Analysis