A Data-driven Approach for Human Pose Tracking Based on Spatio-temporal Pictorial Structure
Soumitra Samanta, Bhabatosh Chanda

TL;DR
This paper introduces a data-driven method for tracking human pose in videos. It uses a spatio-temporal pictorial structure model that combines single-image pose estimation with traditional object tracking, and solves the resulting optimization efficiently.
Contribution
It formulates human pose tracking as a discrete optimization problem solved efficiently with a greedy approach, integrating appearance, temporal, and spatial cues.
Findings
Effective on the VideoPose2, Poses in the Wild, and Outdoor Pose benchmark datasets
Outperforms existing pose tracking methods
Demonstrates robustness on new ICDPose dataset
Abstract
In this paper, we present a data-driven approach for human pose tracking in video data. We formulate human pose tracking as a discrete optimization problem based on a spatio-temporal pictorial structure model and solve it very efficiently in a greedy framework. The model tracks the human pose by combining human pose estimation from a single image with traditional object tracking in a video. Our pose tracking objective function consists of the following terms: the likelihood of the appearance of a part within a frame, the temporal displacement of the part from the previous frame to the current frame, and the spatial dependency of a part on its parent in the graph structure. Experimental evaluation on benchmark datasets (VideoPose2, Poses in the Wild and Outdoor Pose), as well as on our newly built ICDPose dataset, shows the usefulness of the proposed method.
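The three terms of the objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `greedy_track`, the weights `w_t` and `w_s`, the use of Euclidean distance for the temporal and spatial terms, and the assumption that each part has a fixed set of candidate locations are all hypothetical choices made for illustration. The sketch visits parts from the root down and, for each part, greedily picks the candidate with the lowest combined cost.

```python
import numpy as np

def greedy_track(appearance, candidates, prev_pose, parents, offsets,
                 w_t=1.0, w_s=1.0):
    """Pick one candidate location per body part for the current frame.

    All names and cost definitions here are illustrative assumptions.
    appearance[p] : (K,) appearance cost per candidate of part p (lower is better)
    candidates[p] : (K, 2) candidate (x, y) locations of part p
    prev_pose[p]  : (2,) location of part p in the previous frame
    parents[p]    : index of the parent part, or -1 for the root
    offsets[p]    : (2,) expected displacement of part p from its parent
    Parts are assumed to be ordered so that parents[p] < p.
    """
    prev_pose = np.asarray(prev_pose, dtype=float)
    offsets = np.asarray(offsets, dtype=float)
    pose = np.zeros((len(parents), 2))
    for p, parent in enumerate(parents):
        cand = np.asarray(candidates[p], dtype=float)
        # Appearance term: how poorly each candidate matches the part.
        cost = np.asarray(appearance[p], dtype=float).copy()
        # Temporal term: displacement from the part's previous location.
        cost += w_t * np.linalg.norm(cand - prev_pose[p], axis=1)
        if parent >= 0:
            # Spatial term: deviation from the expected offset to the
            # parent's location, which was already fixed earlier in the loop.
            expected = pose[parent] + offsets[p]
            cost += w_s * np.linalg.norm(cand - expected, axis=1)
        # Greedy choice: commit to the cheapest candidate for this part.
        pose[p] = cand[np.argmin(cost)]
    return pose

# Toy example with two parts (root and one child), two candidates each.
pose = greedy_track(
    appearance=[[0.5, 0.4], [0.2, 0.1]],
    candidates=[[[0, 0], [50, 50]], [[0, 11], [30, 30]]],
    prev_pose=[[0, 0], [0, 10]],
    parents=[-1, 0],
    offsets=[[0, 0], [0, 10]],
)
print(pose)
```

In the toy example the second candidate of each part has a slightly better appearance score, but it lies far from the previous frame's location (and, for the child, far from the expected offset to its parent), so the greedy pass keeps the nearby candidates. Because each part is committed before its children are considered, a pass like this is fast but does not guarantee a globally optimal pose.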
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Video Analysis and Summarization
