Beyond Physical Connections: Tree Models in Human Pose Estimation
Fang Wang, Yi Li

TL;DR
This paper demonstrates that simple tree models, which mix single parts and combined parts and need no latent variables, are effective for human pose estimation and outperform state-of-the-art methods on multiple datasets.
Contribution
It shows that a straightforward mixed representation of single and combined parts in a tree model suffices for accurate human pose estimation, challenging the need for more complex models.
Findings
Latent tree learning on LSP introduces no latent variables.
The method outperforms the state of the art on the LSP and PARSE datasets.
The method is effective on both human and animal pose datasets.
Abstract
Simple tree models for articulated objects have prevailed over the last decade. However, it is also believed that these simple tree models cannot capture the large variations found in many scenarios, such as human pose estimation. This paper attempts to address three questions: 1) are simple tree models sufficient? More specifically, 2) how can tree models be used effectively in human pose estimation? and 3) how should combined parts be used together with single parts efficiently? Assume we have a set of single parts and combined parts, and the goal is to estimate a joint distribution of their locations. Surprisingly, we find that no latent variables are introduced on the Leeds Sport Dataset (LSP) when learning latent trees for the deformable model, which aims at approximating the joint distribution of body part locations using a minimal tree structure. This suggests one can straightforwardly use…
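The idea of approximating a joint distribution over part locations with a tree can be illustrated with a small sketch. This is not the latent tree learning procedure the abstract refers to; it is a Chow-Liu-style stand-in that keeps only observed nodes, under the assumptions that each part is summarized by a single scalar coordinate and that pairs of parts are jointly Gaussian. The function names (`pearson`, `gaussian_mi`, `max_spanning_tree`, `toy_samples`) and the three-part toy data are hypothetical and chosen only for illustration.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def gaussian_mi(a, b):
    """Mutual information of two scalars under a Gaussian assumption."""
    rho2 = min(pearson(a, b) ** 2, 1.0 - 1e-12)  # clamp to avoid log(0)
    return -0.5 * math.log(1.0 - rho2)


def max_spanning_tree(columns):
    """Prim's algorithm on pairwise mutual information (Chow-Liu style).

    `columns` maps a part name to its list of sampled coordinates.
    Returns a list of (parent, child) edges over the observed parts only.
    """
    names = list(columns)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the strongest edge that connects the tree to a new part.
        _, u, v = max(
            ((gaussian_mi(columns[u], columns[v]), u, v)
             for u in in_tree for v in names if v not in in_tree),
            key=lambda t: t[0],
        )
        edges.append((u, v))
        in_tree.add(v)
    return edges


def toy_samples(n, seed=0):
    """Hypothetical toy data: each part drifts from the previous one."""
    rng = random.Random(seed)
    shoulder = [rng.gauss(0.0, 1.0) for _ in range(n)]
    elbow = [s + rng.gauss(0.0, 0.5) for s in shoulder]
    wrist = [e + rng.gauss(0.0, 0.5) for e in elbow]
    return {"shoulder": shoulder, "elbow": elbow, "wrist": wrist}


if __name__ == "__main__":
    print(max_spanning_tree(toy_samples(2000)))
```

On the toy data the recovered edges follow the chain used to generate the samples, since the shoulder and wrist are only related through the elbow. A combined part (for example a whole arm) could be added as one more named column in the same dictionary, which is one simple way to picture single and combined parts sharing a tree.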
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods
