A Simple Baseline for Pose Tracking in Videos of Crowded Scenes
Li Yuan, Shuning Chang, Ziyuan Huang, Yichen Zhou, Yunpeng Chen, Xuecheng Nie, Francis E.H. Tay, Jiashi Feng, Shuicheng Yan

TL;DR
This paper introduces a straightforward approach for pose tracking in crowded video scenes, combining multi-object tracking, pose estimation, and optical flow to improve accuracy in complex environments.
Contribution
It presents a simple baseline that integrates detection, tracking, and optical flow for effective pose tracking in crowded scenes, addressing a gap in existing methods.
Findings
Effective in crowded environments
Combines multi-object tracking with optical flow
Achieves promising results in complex event videos
Abstract
This paper presents our solution to the ACM MM challenge: Large-scale Human-centric Video Analysis in Complex Events \cite{lin2020human}; specifically, we focus on Track3: Crowd Pose Tracking in Complex Events. Remarkable progress has been made in multi-pose training in recent years. However, how to track human poses in crowded and complex environments has not been well addressed. We break the problem into several subproblems. First, we use a multi-object tracking method to assign a human ID to each bounding box generated by the detection model. Next, a pose is generated for each bounding box with an ID. Finally, optical flow is used to exploit the temporal information in the videos and generate the final pose tracking result.
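The stages named in the abstract (ID assignment for detected boxes, then temporal refinement with optical flow) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: greedy IoU matching stands in for the multi-object tracker, the flow is given as one displacement vector per keypoint, and the flow-warped pose is blended with the current estimate by a fixed weight. The names `assign_ids`, `propagate_pose`, and `fuse_poses` are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Point = Tuple[float, float]              # (x, y) keypoint or flow vector


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def assign_ids(prev: Dict[int, Box], boxes: List[Box], next_id: int,
               thresh: float = 0.3) -> Tuple[Dict[int, Box], int]:
    """Assumed stand-in for the tracker: greedy IoU matching to previous tracks."""
    # Score every (previous track, new box) pair, best overlap first.
    pairs = sorted(
        ((iou(pb, b), pid, j) for pid, pb in prev.items() for j, b in enumerate(boxes)),
        reverse=True,
    )
    tracks: Dict[int, Box] = {}
    used = set()
    for score, pid, j in pairs:
        if score < thresh:
            break  # remaining pairs overlap too little to be the same person
        if pid in tracks or j in used:
            continue
        tracks[pid] = boxes[j]
        used.add(j)
    # Unmatched detections start new tracks with fresh IDs.
    for j, b in enumerate(boxes):
        if j not in used:
            tracks[next_id] = b
            next_id += 1
    return tracks, next_id


def propagate_pose(prev_pose: List[Point], flow: List[Point]) -> List[Point]:
    """Shift each keypoint of the previous frame by its flow vector."""
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(prev_pose, flow)]


def fuse_poses(current: List[Point], warped: List[Point], alpha: float = 0.5) -> List[Point]:
    """Assumed fusion rule: fixed-weight blend of current and flow-warped keypoints."""
    return [(alpha * cx + (1 - alpha) * wx, alpha * cy + (1 - alpha) * wy)
            for (cx, cy), (wx, wy) in zip(current, warped)]


if __name__ == "__main__":
    prev_tracks = {0: (0.0, 0.0, 10.0, 10.0), 1: (20.0, 20.0, 30.0, 30.0)}
    detections = [(21.0, 21.0, 31.0, 31.0), (1.0, 1.0, 11.0, 11.0), (50.0, 50.0, 60.0, 60.0)]
    tracks, next_id = assign_ids(prev_tracks, detections, next_id=2)
    print(tracks, next_id)
    warped = propagate_pose([(1.0, 2.0)], [(0.5, -1.0)])
    print(fuse_poses([(2.0, 2.0)], warped))
```

In a real system the pose for each tracked box would come from a pose estimation model and the flow from a dense optical flow estimator; here both are passed in as plain lists so the data flow between the stages is visible.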
