LangPose: Language-Aligned Motion for Robust 3D Human Pose Estimation
Longyun Liao, Rong Zheng

TL;DR
LangPose aligns motion embeddings with text embeddings of action labels, using the semantic information in those labels to improve 3D human pose estimation under occlusion and depth ambiguity.
Contribution
The paper proposes LangPose, a two-stage framework (pretraining followed by fine-tuning) that integrates action knowledge through language alignment to improve the robustness and accuracy of 3D human pose estimation.
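To make "language alignment" concrete, here is a minimal sketch of one common way to align motion embeddings with action-label text embeddings: a cross-entropy loss over temperature-scaled cosine similarities. This is an illustration under assumptions, not the authors' implementation. The function names, the `temperature` value, and the choice of loss are all hypothetical; the summary above does not state which objective LangPose uses.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def alignment_loss(motion_emb, text_emb, labels, temperature=0.07):
    """Hypothetical alignment objective (assumed, not taken from the paper).

    motion_emb: (B, D) one embedding per motion clip
    text_emb:   (C, D) one embedding per action-label text
    labels:     (B,)   integer index of the correct action label per clip
    """
    m = l2_normalize(motion_emb)
    t = l2_normalize(text_emb)
    logits = (m @ t.T) / temperature                 # (B, C) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Negative log-likelihood of the correct label for each clip.
    return float(-log_probs[np.arange(len(labels)), labels].mean())

# Toy check: clips whose embeddings match their label text score a lower loss.
text_emb = np.eye(3, 4)                  # 3 action labels, 4-dim embeddings
labels = np.array([0, 1, 2])
loss_aligned = alignment_loss(text_emb[labels], text_emb, labels)
loss_misaligned = alignment_loss(text_emb[[1, 2, 0]], text_emb, labels)
```

In this toy setup the aligned loss is close to zero while the misaligned loss is large, which is the behaviour an alignment objective of this kind is meant to reward.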
Findings
Achieves state-of-the-art performance on Human3.6M and MPI-INF-3DHP datasets.
Reduces MPJPE (mean per-joint position error) to 36.7 mm on Human3.6M with detected 2D poses.
Enhances robustness by incorporating semantic information and masked modeling techniques.
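For readers unfamiliar with the metric quoted above, MPJPE is conventionally the mean Euclidean distance between predicted and ground-truth 3D joint positions, often measured after subtracting a root joint. The sketch below follows that standard definition; the function name, the `root` parameter, and the toy data are illustrative assumptions, and the exact evaluation protocol used in the paper is not stated in this summary.

```python
import numpy as np

def mpjpe(pred, gt, root=None):
    """Mean per-joint position error, in the same unit as the inputs (e.g. mm).

    pred, gt: arrays of shape (..., J, 3)
    root:     optional joint index; if given, both skeletons are expressed
              relative to that joint before the error is computed.
    """
    if root is not None:
        pred = pred - pred[..., root:root + 1, :]
        gt = gt - gt[..., root:root + 1, :]
    # Per-joint Euclidean distance, averaged over all joints and frames.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

# Toy data: 2 frames, 17 joints, prediction shifted by (3, 4, 0) mm everywhere.
gt = np.zeros((2, 17, 3))
pred = gt + np.array([3.0, 4.0, 0.0])
err_absolute = mpjpe(pred, gt)           # every joint is off by 5 mm
err_root_rel = mpjpe(pred, gt, root=0)   # a pure global shift vanishes
```

The root-relative variant shows why a constant translation of the whole skeleton does not count against the estimate under that convention.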
Abstract
2D-to-3D human pose lifting is an ill-posed problem due to depth ambiguity and occlusion. Existing methods that rely on spatial and temporal consistency alone are insufficient to resolve these problems, especially in the presence of significant occlusions or highly dynamic actions. Semantic information, however, offers a complementary signal that can help disambiguate such cases. To this end, we propose LangPose, a framework that leverages action knowledge by aligning motion embeddings with text embeddings of fine-grained action labels. LangPose operates in two stages: pretraining and fine-tuning. In the pretraining stage, the model simultaneously learns to recognize actions and reconstruct 3D poses from masked and noisy 2D poses. During the fine-tuning stage, the model is further refined using real-world 3D human pose estimation datasets without action labels. Additionally, our framework…
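The pretraining input described in the abstract, masked and noisy 2D poses, can be sketched as a simple corruption step applied to a 2D pose sequence. This is a hedged illustration only: the function name, the per-joint masking scheme, the zero fill value, and the `mask_ratio` and `noise_std` defaults are all assumptions, since the summary does not give the corruption details.

```python
import numpy as np

def corrupt_2d_poses(poses_2d, mask_ratio=0.2, noise_std=0.01, rng=None):
    """Add Gaussian noise to 2D keypoints, then zero out a random subset of joints.

    poses_2d: (T, J, 2) sequence of T frames with J joints each
    Returns the corrupted copy and a boolean mask of shape (T, J),
    where True marks a joint that was masked out.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_frames, num_joints, _ = poses_2d.shape
    corrupted = poses_2d + rng.normal(0.0, noise_std, size=poses_2d.shape)
    mask = rng.random((num_frames, num_joints)) < mask_ratio
    corrupted[mask] = 0.0   # assumed fill value for masked joints
    return corrupted, mask

# Toy sequence: 8 frames, 17 joints, all keypoints at (1, 1).
clean = np.ones((8, 17, 2))
corrupted, mask = corrupt_2d_poses(clean, rng=np.random.default_rng(0))
```

A model pretrained on such inputs would be asked to reconstruct the 3D pose for every joint, including those hidden by `mask`, which is one way to encourage robustness to occlusion.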
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Gait Recognition and Analysis
