Team VI-I2R Technical Report on EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2022
Yi Cheng, Dongyun Lin, Fen Fang, Hao Xuan Woon, Qianli Xu, Ying Sun

TL;DR
This report describes an unsupervised domain adaptation framework for video action recognition. It disentangles source features into action-relevant and action-irrelevant parts and uses verb-noun co-occurrence to refine predictions, achieving first place in the EPIC-KITCHENS-100 UDA Challenge for Action Recognition 2022.
Contribution
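It introduces an action-aware domain adaptation method that disentangles source features into action-relevant and action-irrelevant features, aligns target features with the action-relevant ones, and leverages verb-noun co-occurrence to improve action recognition in an unlabeled target domain.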
Findings
Achieved first place in the EPIC-KITCHENS-100 UDA Challenge.
Effective disentanglement of action-relevant and action-irrelevant features.
Improved action recognition accuracy through co-occurrence constraints.
Abstract
In this report, we present the technical details of our submission to the EPIC-KITCHENS-100 Unsupervised Domain Adaptation (UDA) Challenge for Action Recognition 2022. The task is to adapt an action recognition model trained on a labeled source domain to an unlabeled target domain. To this end, we propose an action-aware domain adaptation framework that leverages prior knowledge from the action recognition task during adaptation. Specifically, we use the learned action classifier to disentangle the source features into action-relevant and action-irrelevant features, and then align the target features with the action-relevant features. To further improve action prediction performance, we exploit the verb-noun co-occurrence matrix to constrain and refine the action predictions. Our final submission achieved first place in terms of top-1 action…
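The co-occurrence refinement mentioned in the abstract can be illustrated with a minimal sketch. The excerpt does not give the report's exact formulation, so everything here is an assumption: a binary co-occurrence matrix built from source-domain (verb, noun) labels, a joint score taken as the product of verb and noun probabilities, and the hypothetical helper names `build_cooccurrence` and `refine_action`.

```python
def build_cooccurrence(source_labels, num_verbs, num_nouns):
    """Binary matrix: 1.0 where a (verb, noun) pair appears in the labeled source data.

    Assumption: the report may use counts or a soft prior instead of a binary mask.
    """
    matrix = [[0.0] * num_nouns for _ in range(num_verbs)]
    for verb, noun in source_labels:
        matrix[verb][noun] = 1.0
    return matrix


def refine_action(verb_probs, noun_probs, cooccurrence, unseen_weight=0.0):
    """Return the (verb, noun) pair with the highest co-occurrence-weighted joint score.

    Assumption: the joint score is verb_prob * noun_prob; pairs never seen in the
    source domain are scaled by `unseen_weight` (0.0 removes them entirely).
    """
    best_pair, best_score = None, -1.0
    for v, p_verb in enumerate(verb_probs):
        for n, p_noun in enumerate(noun_probs):
            weight = 1.0 if cooccurrence[v][n] > 0 else unseen_weight
            score = p_verb * p_noun * weight
            if score > best_score:
                best_pair, best_score = (v, n), score
    return best_pair, best_score


# Toy vocabulary (illustrative only): verbs 0=open, 1=cut; nouns 0=door, 1=onion.
source_labels = [(0, 0), (1, 1)]          # "open door" and "cut onion" seen in source
cooc = build_cooccurrence(source_labels, num_verbs=2, num_nouns=2)

verb_probs = [0.6, 0.4]                   # the verb head slightly prefers "open"
noun_probs = [0.3, 0.7]                   # the noun head prefers "onion"

unconstrained_pair, _ = refine_action(verb_probs, noun_probs, cooc, unseen_weight=1.0)
refined_pair, _ = refine_action(verb_probs, noun_probs, cooc)
```

In this toy case the unconstrained prediction is the implausible pair "open onion", while the masked prediction falls back to "cut onion", the highest-scoring pair observed in the source labels. A hard mask rules out any action absent from the source domain, so a small non-zero `unseen_weight` is one way to soften that assumption.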
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Machine Learning in Healthcare
Methods: ALIGN
