Bootstrapping Human Optical Flow and Pose
Aritro Roy Arko, James J. Little, Kwang Moo Yi

TL;DR
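This paper introduces a bootstrapping framework that improves human optical flow and pose estimation together by optimizing the two networks to agree with each other at inference time, achieving state-of-the-art results on Human 3.6M, 3D Poses in the Wild, and a human-related subset of Sintel.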
Contribution
It presents a joint optimization approach that improves both tasks together by exploiting their interdependence: the pose and optical flow networks are optimized at inference time to agree with each other.
Findings
Achieves state-of-the-art pose estimation accuracy.
Improves optical flow accuracy at human joint locations.
Shows that considering the two tasks together improves both.
Abstract
We propose a bootstrapping framework to enhance human optical flow and pose. We show that, for videos involving humans in scenes, we can improve both the optical flow and the pose estimation quality of humans by considering the two tasks at the same time. We enhance optical flow estimates by fine-tuning them to fit the human pose estimates and vice versa. In more detail, we optimize the pose and optical flow networks to agree with each other at inference time. We show that this yields state-of-the-art results on the Human 3.6M and 3D Poses in the Wild datasets, as well as a human-related subset of the Sintel dataset, both in terms of pose estimation accuracy and the optical flow accuracy at human joint locations. Code is available at https://github.com/ubc-vision/bootstrapping-human-optical-flow-and-pose
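As a rough illustration of the agreement idea in the abstract, the sketch below treats "pose and flow agree" as: a joint at frame t, moved by the optical flow sampled at that joint, should land on the same joint at frame t+1. This is a toy stand-in and not the authors' implementation. It updates the estimates directly with gradient descent instead of fine-tuning network weights, and the names `consistency_loss` and `refine`, the squared-error loss, and the step size are all assumptions made for this example.

```python
import numpy as np

def consistency_loss(joints_t, joints_t1, flow_at_joints):
    """Mean squared disagreement between flow-warped joints and next-frame joints."""
    residual = joints_t + flow_at_joints - joints_t1
    return float(np.mean(np.sum(residual ** 2, axis=-1)))

def refine(joints_t, joints_t1, flow_at_joints, steps=100, lr=0.1):
    """Toy refinement (assumption): gradient descent on the consistency loss,
    updating both the flow at the joints and the next-frame joint positions."""
    joints_t = np.asarray(joints_t, dtype=float)
    joints_t1 = np.array(joints_t1, dtype=float)   # copy, updated in place below
    flow = np.array(flow_at_joints, dtype=float)   # copy, updated in place below
    n = joints_t.shape[0]
    for _ in range(steps):
        residual = joints_t + flow - joints_t1
        grad_flow = 2.0 * residual / n             # d(loss) / d(flow)
        flow -= lr * grad_flow                     # pull the flow toward the pose
        joints_t1 += lr * grad_flow                # d(loss)/d(joints_t1) = -grad_flow
    return joints_t1, flow
```

Because both estimates move toward each other, neither is treated as ground truth. A real system would also need a term that keeps each estimate close to its original network output, which this sketch omits.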
Taxonomy
Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · Video Surveillance and Tracking Methods
