Visual Person Understanding through Multi-Task and Multi-Dataset Learning

Kilian Pfeiffer; Alexander Hermans; István Sárándi; Mark Weber; Bastian Leibe

arXiv:1906.03019·cs.CV·November 10, 2020

TL;DR

This paper presents a multi-task learning approach for comprehensive person understanding, combining multiple datasets to improve performance across re-identification, attribute classification, body segmentation, and pose estimation.

Contribution

It introduces a method to jointly learn multiple person-related tasks from diverse datasets without performance loss, suitable for resource-limited environments.

Findings

01 The multi-task model matches or outperforms its single-task counterparts.

02 Several datasets can be combined during training without the performance drop often seen in other contexts.

03 Sharing parameters between tasks adds no significant computational overhead.

Abstract

We address the problem of learning a single model for person re-identification, attribute classification, body part segmentation, and pose estimation. With predictions for these tasks we gain a more holistic understanding of persons, which is valuable for many applications. This is a classical multi-task learning problem. However, no dataset exists that these tasks could be jointly learned from. Hence several datasets need to be combined during training, which in other contexts has often led to reduced performance in the past. We extensively evaluate how the different tasks and datasets influence each other and how different degrees of parameter sharing between the tasks affect performance. Our final model matches or outperforms its single-task counterparts without creating significant computational overhead, rendering it highly interesting for resource-constrained scenarios such as…
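
The abstract notes that no single dataset carries labels for all four tasks, so training has to draw on several datasets. One common way to handle this, shown here as a minimal sketch and not as the authors' procedure, is to alternate batches between datasets and to count only the losses of the tasks that the current batch's dataset annotates. The dataset names, the task-to-dataset mapping, the round-robin order, and the loss weights below are all assumptions made for illustration; this page does not describe the paper's actual sampling scheme or loss weighting.

```python
from itertools import cycle, islice

# Hypothetical mapping from dataset name to the tasks it has labels for.
# These names are placeholders, not the datasets used in the paper.
DATASET_TASKS = {
    "reid_set": {"reid", "attributes"},
    "parts_set": {"segmentation"},
    "pose_set": {"pose"},
}


def batch_schedule(dataset_names, num_steps):
    """Round-robin over datasets so every task is trained regularly."""
    return list(islice(cycle(dataset_names), num_steps))


def multitask_loss(per_task_losses, dataset_name, weights=None):
    """Sum the losses of only those tasks that the batch's dataset annotates.

    per_task_losses: dict mapping task name to a scalar loss value.
    weights: optional dict of per-task weights (assumed default: 1.0).
    """
    weights = weights or {}
    labelled = DATASET_TASKS[dataset_name]
    return sum(
        weights.get(task, 1.0) * loss
        for task, loss in per_task_losses.items()
        if task in labelled
    )
```

In a real training loop the scalar losses would come from task-specific heads on top of a shared backbone, and masking out unlabelled tasks keeps a pose-only batch, for example, from pushing gradients through the re-identification head.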
