Team VI-I2R Technical Report on EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2022

Yi Cheng; Dongyun Lin; Fen Fang; Hao Xuan Woon; Qianli Xu; Ying Sun

arXiv:2301.12436·cs.CV·January 31, 2023

TL;DR

This report describes an unsupervised domain adaptation framework for video action recognition. It disentangles source features into action-relevant and action-irrelevant parts and uses verb-noun co-occurrence to refine predictions. The submission placed first in the EPIC-KITCHENS-100 UDA Challenge for Action Recognition 2022.

Contribution

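The report introduces an action-aware domain adaptation method that disentangles source features into action-relevant and action-irrelevant parts, aligns target features with the action-relevant ones, and uses a verb-noun co-occurrence matrix to refine action predictions on the unlabeled target domain.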

Findings

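01. Achieved first place in the EPIC-KITCHENS-100 UDA Challenge for Action Recognition 2022.

02. Disentangled source features into action-relevant and action-irrelevant features using the learned action classifier.

03. Improved action prediction by constraining and refining predictions with the verb-noun co-occurrence matrix.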

Abstract

In this report, we present the technical details of our submission to the EPIC-KITCHENS-100 Unsupervised Domain Adaptation (UDA) Challenge for Action Recognition 2022. The task is to adapt an action recognition model trained on a labeled source domain to an unlabeled target domain. To this end, we propose an action-aware domain adaptation framework that leverages prior knowledge from the action recognition task during adaptation. Specifically, we disentangle the source features into action-relevant and action-irrelevant features using the learned action classifier, and then align the target features with the action-relevant features. To further improve action prediction performance, we exploit the verb-noun co-occurrence matrix to constrain and refine the action predictions. Our final submission achieved first place in terms of top-1 action…
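
The co-occurrence idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact refinement rule, so the code below assumes one simple possibility: an action score is the product of the verb and noun probabilities, and verb-noun pairs that never co-occur in the labeled source data are masked out. The function names, the hard-mask rule, and the toy numbers are all hypothetical and are not taken from the report.

```python
from typing import List, Tuple


def refine_action_scores(
    verb_probs: List[float],
    noun_probs: List[float],
    cooccurrence: List[List[int]],
) -> List[List[float]]:
    """Return joint verb-noun scores, zeroed where the pair never co-occurs.

    Assumption (not from the report): a hard mask built from source counts.
    cooccurrence[v][n] is how often verb v and noun n appear together.
    """
    scores = []
    for v, p_verb in enumerate(verb_probs):
        row = []
        for n, p_noun in enumerate(noun_probs):
            allowed = 1.0 if cooccurrence[v][n] > 0 else 0.0
            row.append(p_verb * p_noun * allowed)
        scores.append(row)
    return scores


def top_action(scores: List[List[float]]) -> Tuple[int, int]:
    """Return the (verb index, noun index) pair with the highest score."""
    pairs = [(v, n) for v in range(len(scores)) for n in range(len(scores[v]))]
    return max(pairs, key=lambda vn: scores[vn[0]][vn[1]])


# Toy example with hypothetical classes: verbs 0="cut", 1="wash";
# nouns 0="plate", 1="onion". "cut plate" never occurs in the source data.
verb_probs = [0.6, 0.4]
noun_probs = [0.7, 0.3]
cooccurrence = [[0, 5], [8, 2]]
no_constraint = [[1, 1], [1, 1]]

unconstrained = top_action(refine_action_scores(verb_probs, noun_probs, no_constraint))
constrained = top_action(refine_action_scores(verb_probs, noun_probs, cooccurrence))
print(unconstrained, constrained)
```

In this toy case the unconstrained prediction is the implausible pair "cut plate", while the masked scores select "wash plate". A softer variant could reweight scores by normalized co-occurrence frequency instead of applying a hard mask.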

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Machine Learning in Healthcare

Methods: ALIGN