2D Pre-Training for 3D Pose Estimation

Liyao Jiang; Ruichen Chen; Keith G. Mills

arXiv:2604.22830·cs.CV·April 28, 2026

TL;DR

This paper shows that 2D pre-training improves both the accuracy and the cross-dataset generalization of 3D human pose estimation, and makes training more efficient.

Contribution

The paper extends an existing 3D human pose estimation (HPE) pre-training scheme to additional 2D and 3D datasets, such as Occlusion Person, and systematically studies how 2D pre-training affects accuracy and efficiency.

Findings

01

2D pre-training outperforms 3D-only training in accuracy.

02

Pre-training enhances model generalization across datasets.

03

The model achieves a mean per-joint position error (MPJPE) below 64.5 mm on MPII and Human3.6M.
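
For readers unfamiliar with the metric, MPJPE is the Euclidean distance between each predicted joint and its ground-truth position, averaged over all joints. The sketch below shows that standard definition only; the function name `mpjpe` and the toy coordinates are illustrative assumptions, not the paper's evaluation code, and any alignment or root-centering step the authors may apply is not shown.

```python
import math


def mpjpe(predicted, target):
    """Mean per-joint position error: the average Euclidean distance
    between matching joints, in the same units as the inputs (e.g. mm)."""
    if not predicted or len(predicted) != len(target):
        raise ValueError("poses must be non-empty and have the same number of joints")
    total = 0.0
    for pred_joint, true_joint in zip(predicted, target):
        # math.dist computes the Euclidean distance between two points.
        total += math.dist(pred_joint, true_joint)
    return total / len(predicted)


# Hypothetical two-joint pose, coordinates in millimetres.
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
error_mm = mpjpe(pred, gt)  # joint errors are 0 and 5, so the mean is 2.5
```

A lower value means the predicted skeleton lies closer to the ground truth; a dataset-level score is typically the mean of this quantity over all evaluated poses.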

Abstract

Pre-training is a general method used across a range of deep learning tasks. By first training a model on one task, and then further training it on the downstream task used for final evaluation, the model is forced to learn a more general understanding of the input data. While pre-training has previously been applied to 3D Human Pose Estimation (HPE), the datasets used are typically limited to a few strong benchmarks, such as Human3.6M. Therefore, in this project, we expand the scope of an existing 3D HPE scheme to be compatible with additional 2D and 3D HPE datasets, such as Occlusion Person. We perform an extensive study of how aspects of 2D pre-training, such as model size, affect downstream performance, and of the extent to which pre-training helps the model generalize to different datasets. Experimental results show that 2D pre-training consistently outperforms training on 3D…
