Multi-task Learning For Joint Action and Gesture Recognition

Konstantinos Spathis; Nikolaos Kardaris; Petros Maragos

arXiv:2505.17867·cs.CV·May 26, 2025

TL;DR

This paper demonstrates that multi-task learning for joint action and gesture recognition enhances efficiency, robustness, and generalization by leveraging shared representations, outperforming single-task methods across multiple datasets.

Contribution

It introduces a multi-task learning framework that jointly recognizes actions and gestures, showing improved performance over separate models.

Findings

01 Joint models outperform single-task models on multiple datasets.

02 Multi-task learning improves robustness and generalization.

03 Shared representations benefit both action and gesture recognition.

Abstract

In practical applications, several computer vision tasks often need to be addressed simultaneously. Multi-task learning typically achieves this by jointly training a single deep neural network to learn shared representations, which improves both efficiency and generalization. Although action and gesture recognition are closely related tasks, since both focus on body and hand movements, current state-of-the-art methods handle them separately. In this paper, we show that employing a multi-task learning paradigm for action and gesture recognition yields more efficient, robust, and generalizable visual representations by leveraging the synergies between these tasks. Extensive experiments on multiple action and gesture datasets demonstrate that handling actions and gestures in a single architecture can achieve better performance on both tasks than their single-task learning variants.
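The idea of one network with shared representations serving two tasks can be sketched in a few lines. The following is a minimal illustration of generic hard parameter sharing, not the architecture used in the paper: the single dense shared layer, the layer sizes, the equal loss weights, and all names (`JointModel`, `joint_loss`, `w_action`, `w_gesture`) are assumptions made for this example. A real system would use a video backbone and a deep learning framework with automatic differentiation.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output value per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def cross_entropy(logits, target):
    """Softmax cross-entropy for one example, computed stably."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class JointModel:
    """Shared encoder followed by one classification head per task."""

    def __init__(self, n_features, n_hidden, n_actions, n_gestures, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(rng, n_hidden, n_features)      # used by both tasks
        self.action_head = init_layer(rng, n_actions, n_hidden)  # action-specific
        self.gesture_head = init_layer(rng, n_gestures, n_hidden)  # gesture-specific

    def forward(self, x):
        hidden = relu(linear(x, *self.shared))  # shared representation
        return linear(hidden, *self.action_head), linear(hidden, *self.gesture_head)


def joint_loss(model, x, action_label=None, gesture_label=None,
               w_action=1.0, w_gesture=1.0):
    """Weighted sum of per-task losses; a missing label skips that task."""
    action_logits, gesture_logits = model.forward(x)
    loss = 0.0
    if action_label is not None:
        loss += w_action * cross_entropy(action_logits, action_label)
    if gesture_label is not None:
        loss += w_gesture * cross_entropy(gesture_logits, gesture_label)
    return loss


# Usage: one feature vector standing in for an encoded video clip.
model = JointModel(n_features=8, n_hidden=4, n_actions=3, n_gestures=5)
clip = [0.5] * 8
total = joint_loss(model, clip, action_label=1, gesture_label=2)
```

Because both heads read the same hidden vector, a gradient step on `total` would update the shared layer with signal from both tasks, which is the mechanism the abstract credits for the improved representations. Making each label optional lets a sample from an action-only or gesture-only dataset contribute just its own term.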
