Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing
Tianfei Zhou, Wenguan Wang, Si Liu, Yi Yang, Luc Van Gool

TL;DR
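This paper introduces an end-to-end, bottom-up framework for instance-aware human parsing. It jointly learns category-level semantic segmentation and multi-person pose estimation, and groups pixels into person instances with a differentiable matching step, improving both accuracy and efficiency over existing bottom-up methods.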
Contribution
It proposes a multi-granularity learning approach that casts pixel grouping as joint assembling and solves it with differentiable bipartite matching, enabling direct supervision of the grouping step and removing the need for complex post-processing.
Findings
Outperforms existing bottom-up methods in accuracy.
Achieves more efficient inference on multiple datasets.
Enables end-to-end training with direct error back-propagation.
Abstract
To address the challenging task of instance-aware human part parsing, a new bottom-up regime is proposed to learn category-level human semantic segmentation as well as multi-person pose estimation in a joint and end-to-end manner. It is a compact, efficient and powerful framework that exploits structural information over different human granularities and eases the difficulty of person partitioning. Specifically, a dense-to-sparse projection field, which allows explicitly associating dense human semantics with sparse keypoints, is learnt and progressively improved over the network feature pyramid for robustness. Then, the difficult pixel grouping problem is cast as an easier, multi-person joint assembling task. By formulating joint association as maximum-weight bipartite matching, a differentiable solution is developed to exploit projected gradient descent and Dykstra's cyclic projection…
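The matching idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a square weight matrix, relaxes the matching to the set of doubly stochastic matrices (non-negative entries, rows and columns summing to 1), and uses hypothetical names (`dykstra`, `soft_matching`) and hyperparameters (`lr`, `n_steps`, `n_inner`) chosen only for illustration. The exact constraint set and solver settings used in the paper may differ. Each outer step is a gradient ascent step on the linear matching objective, followed by a projection back onto the constraint set computed with Dykstra's cyclic projection.

```python
import numpy as np

def project_rows(x):
    # Euclidean projection onto {X : every row sums to 1}.
    return x - (x.sum(axis=1, keepdims=True) - 1.0) / x.shape[1]

def project_cols(x):
    # Euclidean projection onto {X : every column sums to 1}.
    return x - (x.sum(axis=0, keepdims=True) - 1.0) / x.shape[0]

def project_nonneg(x):
    # Euclidean projection onto {X : X >= 0}.
    return np.maximum(x, 0.0)

def dykstra(y, projections, n_iters=100):
    # Dykstra's cyclic projection: approximates the projection of y onto
    # the intersection of the convex sets behind `projections`.
    x = y.copy()
    corrections = [np.zeros_like(y) for _ in projections]
    for _ in range(n_iters):
        for i, proj in enumerate(projections):
            z = x + corrections[i]   # re-add this set's previous correction
            x = proj(z)
            corrections[i] = z - x   # remember what this projection removed
    return x

def soft_matching(weights, lr=0.1, n_steps=200, n_inner=100):
    # Projected gradient ascent on <weights, X> over the relaxed matching set.
    # Assumes `weights` is square (an illustrative simplification).
    n = weights.shape[0]
    x = np.full((n, n), 1.0 / n)     # start from the uniform soft assignment
    sets = [project_rows, project_cols, project_nonneg]
    for _ in range(n_steps):
        # The gradient of the linear objective with respect to X is `weights`.
        x = dykstra(x + lr * weights, sets, n_inner)
    return x
```

A soft assignment from `soft_matching` can be rounded to a hard matching with `np.argmax(X, axis=1)`. Every operation in the loop is an affine map or an element-wise maximum, which is why a solver of this shape can be unrolled inside an automatic-differentiation framework and trained through; the NumPy version here only demonstrates the forward computation.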
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Neural Network Applications · Video Surveillance and Tracking Methods
