You Only Learn One Query: Learning Unified Human Query for Single-Stage Multi-Person Multi-Task Human-Centric Perception

Sheng Jin; Shuhuai Li; Tong Li; Wentao Liu; Chen Qian; Ping Luo

arXiv:2312.05525·cs.CV·July 16, 2024·1 cites

TL;DR

This paper presents HQNet, a unified single-stage framework for multi-person multi-task human-centric perception, introducing a novel Human Query representation and a new benchmark dataset for comprehensive evaluation.

Contribution

The paper introduces a unified Human Query approach for multi-task perception and proposes the COCO-UniHuman benchmark dataset for evaluation.

Findings

01 Achieves state-of-the-art performance among multi-task models

02 Demonstrates competitive results with task-specific models

03 Shows Human Query's strong generalization to new tasks

Abstract

Human-centric perception (e.g., detection, segmentation, pose estimation, and attribute analysis) is a long-standing problem in computer vision. This paper introduces a unified and versatile framework (HQNet) for single-stage multi-person multi-task human-centric perception (HCP). Our approach centers on learning a unified human query representation, denoted as Human Query, which captures intricate instance-level features for individual persons and disentangles complex multi-person scenarios. Although different HCP tasks have been well studied individually, single-stage multi-task learning of HCP tasks has not been fully exploited in the literature due to the absence of a comprehensive benchmark dataset. To address this gap, we propose the COCO-UniHuman benchmark to enable model development and comprehensive evaluation. Experimental results demonstrate the proposed method's state-of-the-art…
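The abstract's central idea, one shared query vector per person that every task reads from, can be illustrated with a toy sketch. This is not HQNet's implementation: the embedding size, the number of queries, the head names, the output sizes, and the random weights are all hypothetical stand-ins, and the "heads" are plain linear maps chosen only to keep the example dependency-free.

```python
import random

def make_linear(in_dim, out_dim, rng):
    """Return a random (out_dim x in_dim) matrix standing in for a trained head."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]

def apply_linear(weights, vec):
    """Multiply a weight matrix by one query vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def decode_queries(queries, heads):
    """Run every task head on every per-person query vector."""
    return [{task: apply_linear(w, q) for task, w in heads.items()} for q in queries]

rng = random.Random(0)
QUERY_DIM = 8      # hypothetical embedding size
NUM_QUERIES = 3    # hypothetical number of person candidates

# Hypothetical output sizes: 4 box coordinates, 17 keypoints as (x, y) pairs,
# and 2 attribute logits. None of these come from the paper.
HEAD_DIMS = {"box": 4, "keypoints": 17 * 2, "attributes": 2}
heads = {task: make_linear(QUERY_DIM, dim, rng) for task, dim in HEAD_DIMS.items()}

# Stand-in for the per-person embeddings a query decoder would produce.
queries = [[rng.uniform(-1, 1) for _ in range(QUERY_DIM)] for _ in range(NUM_QUERIES)]

predictions = decode_queries(queries, heads)
for index, pred in enumerate(predictions):
    # Print only the output size of each task for each person query.
    print(index, {task: len(values) for task, values in pred.items()})
```

The point of the sketch is the data flow: each person is represented once, and adding a task means adding an entry to `heads` rather than a separate per-task pipeline.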

Code & Models

Repositories

lishuhuai527/coco-unihuman
PyTorch · Official

Taxonomy

Topics: Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning · Video Surveillance and Tracking Methods