TL;DR
PersonLab is a box-free, bottom-up, single-shot model that jointly performs multi-person pose estimation and person instance segmentation using part-based modeling and geometric embeddings. It reaches 0.665 keypoint AP on COCO test-dev, with runtime essentially independent of the number of people in the scene.
Contribution
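The paper introduces PersonLab, a bottom-up approach that detects individual keypoints and predicts their relative displacements to group them into person poses, and proposes a part-induced geometric embedding descriptor that assigns person pixels to their instance. It outperforms previous bottom-up pose estimation methods.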
Findings
Achieves 0.665 AP for keypoints on COCO test-dev.
First bottom-up method to report competitive results for the person class on the COCO instance segmentation task.
Allows efficient inference, with runtime essentially independent of the number of people in the scene.
Abstract
We present a box-free bottom-up approach for the tasks of pose estimation and instance segmentation of people in multi-person images using an efficient single-shot model. The proposed PersonLab model tackles both semantic-level reasoning and object-part associations using part-based modeling. Our model employs a convolutional network which learns to detect individual keypoints and predict their relative displacements, allowing us to group keypoints into person pose instances. Further, we propose a part-induced geometric embedding descriptor which allows us to associate semantic person pixels with their corresponding person instance, delivering instance-level person segmentations. Our system is based on a fully-convolutional architecture and allows for efficient inference, with runtime essentially independent of the number of people present in the scene. Trained on COCO data alone, our…
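The pixel-to-instance association described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes, for illustration only, that the network outputs a semantic person mask plus, for every pixel, an offset vector to each keypoint of the person that pixel belongs to, and that poses have already been grouped. The function name `assign_pixels_to_instances` and all array layouts are hypothetical.

```python
import numpy as np


def assign_pixels_to_instances(seg_mask, offsets, poses):
    """Assign each person pixel to the closest detected pose (illustrative sketch).

    seg_mask: (H, W) bool array, True where a pixel is classified as "person".
    offsets:  (H, W, K, 2) array of (dy, dx) vectors, assumed to point from each
              pixel to the K keypoints of the person that pixel belongs to.
    poses:    (N, K, 2) array of (y, x) keypoint coordinates for N detected people.
    Returns an (H, W) int array holding an instance index, or -1 for background.
    """
    height, width = seg_mask.shape
    labels = np.full((height, width), -1, dtype=np.int64)
    if poses.shape[0] == 0:
        return labels  # no detected people, so everything stays background

    ys, xs = np.mgrid[0:height, 0:width]
    grid = np.stack([ys, xs], axis=-1).astype(np.float64)  # (H, W, 2)

    # Where each pixel "thinks" its own person's keypoints are: (H, W, K, 2).
    predicted = grid[:, :, None, :] + offsets

    # Mean distance between predicted and detected keypoints, per pose: (H, W, N).
    diff = predicted[:, :, None, :, :] - poses[None, None, :, :, :]
    dist = np.linalg.norm(diff, axis=-1).mean(axis=-1)

    # Pick the best-matching pose for person pixels only.
    best = dist.argmin(axis=-1)
    labels[seg_mask] = best[seg_mask]
    return labels
```

Because the assignment is a dense array operation over all pixels, its cost grows only mildly with the number of detected poses, which is in the spirit of the runtime claim above. The actual model's decoding details are not given in the abstract and may differ.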
