2D/3D Pose Estimation and Action Recognition using Multitask Deep Learning

Diogo C. Luvizon; David Picard; Hedi Tabia

arXiv:1802.09232·cs.CV·March 22, 2018

TL;DR

This paper introduces a multitask deep learning framework that jointly performs 2D/3D human pose estimation from still images and action recognition from video sequences. A single architecture, optimized end-to-end, achieves state-of-the-art results on both tasks.

Contribution

The work presents a single architecture for simultaneous pose estimation and action recognition and shows that end-to-end optimization yields significantly higher accuracy than learning the two tasks separately.

Findings

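01 Achieves state-of-the-art results on four datasets (MPII, Human3.6M, Penn Action and NTU).

02 End-to-end optimization yields significantly higher accuracy than learning the tasks separately.

03 Can be trained seamlessly on data from different categories simultaneously.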

Abstract

Action recognition and human pose estimation are closely related, but the two problems are generally handled as distinct tasks in the literature. In this work, we propose a multitask framework for joint 2D and 3D pose estimation from still images and human action recognition from video sequences. We show that a single architecture can be used to solve the two problems efficiently and still achieve state-of-the-art results. Additionally, we demonstrate that end-to-end optimization leads to significantly higher accuracy than separate learning. The proposed architecture can be trained seamlessly with data from different categories simultaneously. The reported results on four datasets (MPII, Human3.6M, Penn Action and NTU) demonstrate the effectiveness of our method on the targeted tasks.
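To make the idea of end-to-end multitask optimization concrete, the sketch below combines a pose loss and an action loss into one scalar objective that a single network could be trained against. It is an illustration only and is not taken from the paper: the function names, the mean-absolute-error pose loss, the cross-entropy action loss, and the weights `w_pose` and `w_action` are all assumptions.

```python
import numpy as np

def pose_loss(pred_joints, true_joints):
    """Mean absolute error over joint coordinates (an assumed choice of loss)."""
    return float(np.mean(np.abs(pred_joints - true_joints)))

def action_loss(logits, label):
    """Cross-entropy for one clip, computed with a numerically stable log-softmax."""
    shifted = logits - np.max(logits)
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return float(-log_probs[label])

def multitask_loss(pred_joints, true_joints, logits, label,
                   w_pose=1.0, w_action=1.0):
    """Weighted sum of both task losses; the weights are hypothetical."""
    return (w_pose * pose_loss(pred_joints, true_joints)
            + w_action * action_loss(logits, label))

# Example: 16 joints in 2D and 4 action classes (both sizes are arbitrary).
rng = np.random.default_rng(0)
pred = rng.normal(size=(16, 2))
true = rng.normal(size=(16, 2))
logits = rng.normal(size=4)
total = multitask_loss(pred, true, logits, label=2)
```

Because both terms feed one objective, gradients from the action term can also update the layers that produce the pose estimate, which is the general sense in which joint training differs from learning each task separately.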
