Improving Pose Estimation through Contextual Activity Fusion
David Poulton, Richard Klein

TL;DR
This paper introduces a method to improve pose estimation accuracy by fusing activity context into an existing deep learning model, showing improved performance over the baseline, especially on uncommon poses and typically difficult joints.
Contribution
The paper proposes a novel activity fusion technique using a 1x1 convolution to incorporate activity information into pose estimation architectures.
Findings
Performance improved in uncommon poses
Enhanced accuracy on difficult joints
Ablative analysis confirms activity context benefits
Abstract
This research presents the idea of fusing activity information into existing pose estimation architectures to enhance their predictive ability. This is motivated by the rise of higher-level concepts in modern machine learning architectures, and by the belief that activity context is a useful piece of information for the problem of pose estimation. To analyse this concept, we take an existing deep learning architecture and augment it with an additional 1x1 convolution that fuses activity information into the model. We perform evaluation and comparison on a common pose estimation dataset, and show a performance improvement over our baseline model, especially on uncommon poses and on typically difficult joints. Additionally, we perform an ablative analysis to indicate that the performance improvement does in fact draw from the activity information.
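The fusion step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the activity label is one-hot encoded, tiled across every spatial location, concatenated with the feature map along the channel axis, and then mixed by a 1x1 convolution. The function name `fuse_activity`, the tensor shapes, and the encoding are all hypothetical choices made for illustration.

```python
import numpy as np


def fuse_activity(features, activity_id, num_activities, weight, bias):
    """Fuse an activity label into a feature map via a 1x1 convolution.

    Hypothetical sketch (not the paper's code):
      features: array of shape (C, H, W)
      weight:   array of shape (C_out, C + num_activities)
      bias:     array of shape (C_out,)
    Returns an array of shape (C_out, H, W).
    """
    _, height, width = features.shape

    # Assumed encoding: one-hot vector over the activity classes.
    one_hot = np.zeros(num_activities, dtype=features.dtype)
    one_hot[activity_id] = 1.0

    # Tile the activity vector over every spatial location: (A, H, W).
    activity_map = np.broadcast_to(
        one_hot[:, None, None], (num_activities, height, width)
    )

    # Stack image features and activity channels: (C + A, H, W).
    stacked = np.concatenate([features, activity_map], axis=0)

    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    return np.einsum("oc,chw->ohw", weight, stacked) + bias[:, None, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((8, 4, 4))        # 8 channels, 4x4 map
    w = rng.standard_normal((8, 8 + 5)) * 0.1     # 5 hypothetical activities
    b = np.zeros(8)
    fused = fuse_activity(feats, activity_id=2, num_activities=5, weight=w, bias=b)
    print(fused.shape)
```

Under these assumptions the activity label adds a learned, activity-specific offset to each output channel, while the 1x1 kernel leaves the spatial resolution unchanged, so a layer of this kind could in principle be inserted into an existing network without altering the shapes seen by later layers.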
Taxonomy
Methods: Convolution · 1x1 Convolution
