Causal-Inspired Multitask Learning for Video-Based Human Pose Estimation

Haipeng Chen; Sifan Wu; Zhigang Wang; Yifang Yin; Yingying Jiao; Yingda Lyu; Zhenguang Liu

arXiv:2501.14356 · cs.CV · January 27, 2025

TL;DR

This paper introduces a causal-inspired multitask learning framework for video-based human pose estimation, enhancing robustness and interpretability by modeling causal relationships and prioritizing keypoint-relevant features.

Contribution

It pioneers a causal perspective in pose estimation, integrating auxiliary tasks for causal reasoning and a token importance module for improved interpretability and performance.

Findings

01. Outperforms state-of-the-art on three benchmark datasets
02. Enhances model robustness to challenging scenes
03. Improves interpretability by identifying causal tokens

Abstract

Video-based human pose estimation has long been a fundamental yet challenging problem in computer vision. Previous studies focus on spatio-temporal modeling through improved architecture design and optimization strategies. However, they overlook the causal relationships among the joints, leading to models that may be overly tailored and thus perform poorly in challenging scenes. Therefore, adequate causal reasoning capability and good model interpretability are both indispensable prerequisites for achieving reliable results. In this paper, we pioneer a causal perspective on pose estimation and introduce a causal-inspired multitask learning framework consisting of two stages. In the first stage, we try to endow the model with causal spatio-temporal modeling ability by introducing two self-supervision auxiliary tasks. Specifically, these auxiliary tasks…

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods

Methods: Focus