Describe me if you can! Characterized Instance-level Human Parsing
Angelique Loesch, Romaric Audigier

TL;DR
This paper introduces a new dataset with detailed attribute labels for multi-instance human parsing and proposes a fast, accurate transformer-based method for characterizing human attributes in images.
Contribution
The paper presents the CCIHP dataset with new attribute labels and the HPTR transformer-based method for efficient, precise multi-instance human parsing.
Findings
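CCIHP extends the multi-HP dataset CIHP with 20 new labels covering color, size, and pattern characteristics
HPTR is the fastest of the state-of-the-art multi-HP methods
HPTR achieves precision comparable to the most precise bottom-up method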
Abstract
Several computer vision applications, such as person search or online fashion, rely on human description. Instance-level human parsing (HP) is therefore relevant, since it localizes semantic attributes and body parts within a person. But how can these attributes be characterized? To our knowledge, only some single-HP datasets describe attributes with color, size and/or pattern characteristics. Datasets for multi-HP in the wild with such characteristics are lacking. In this article, we propose the CCIHP dataset, based on the multi-HP dataset CIHP, with 20 new labels covering these 3 kinds of characteristics. In addition, we propose HPTR, a new bottom-up multi-task method based on transformers, as a fast and scalable baseline. It is the fastest of the state-of-the-art multi-HP methods while having precision comparable to the most precise bottom-up method. We hope this will…
