Human-Centric Transformer for Domain Adaptive Action Recognition

Kun-Yu Lin; Jiaming Zhou; Wei-Shi Zheng

arXiv:2407.10860 · cs.CV · July 16, 2024

TL;DR

This paper introduces HCTransformer, a human-centric transformer that improves domain adaptive action recognition by explicitly modeling human cues and human-context interaction cues, achieving state-of-the-art results on three benchmark datasets.

Contribution

The paper proposes a novel human-centric learning paradigm with a decoupled transformer architecture to better exploit human cues for domain adaptation in action recognition.

Findings

1. Achieves state-of-the-art performance on three benchmark datasets.

2. Effectively preserves human cues during domain-invariant feature learning.

3. Demonstrates the importance of human-centric cues in cross-domain action recognition.

Abstract

We study the domain adaptation task for action recognition, namely domain adaptive action recognition, which aims to effectively transfer action recognition power from a label-sufficient source domain to a label-free target domain. Since actions are performed by humans, it is crucial to exploit human cues in videos when recognizing actions across domains. However, existing methods are prone to losing human cues but prefer to exploit the correlation between non-human contexts and associated actions for recognition, and the contexts of interest agnostic to actions would reduce recognition performance in the target domain. To overcome this problem, we focus on uncovering human-centric action cues for domain adaptive action recognition, and our conception is to investigate two aspects of human-centric action cues, namely human cues and human-context interaction cues. Accordingly, our…

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Gait Recognition and Analysis

Methods: Attention Is All You Need · Byte Pair Encoding · Layer Normalization · Focus · Linear Layer · Label Smoothing · Adam · Dropout · Multi-Head Attention · Dense Connections