Channel-Temporal Attention for First-Person Video Domain Adaptation

Xianyuan Liu; Shuo Zhou; Tao Lei; Haiping Lu

arXiv:2108.07846 · cs.CV · August 20, 2021

TL;DR

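This paper introduces channel-temporal attention blocks, a network (CTAN) that integrates them into existing architectures, and two small-scale datasets for unsupervised domain adaptation in first-person video action recognition. CTAN outperforms baselines on both proposed datasets and on the existing EPIC$_{cvpr20}$ dataset.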

Contribution

It proposes a new attention-based network architecture and two first-person video datasets tailored for domain adaptation tasks.

Findings

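01. CTAN outperforms baselines on three datasets: the two proposed datasets (ADL$_{small}$ and GTEA-KITCHEN) and the existing EPIC$_{cvpr20}$.

02. The two new small-scale datasets address the lack of data for first-person domain adaptation research.

03. The attention blocks capture channel-wise and temporal-wise relationships and model their inter-dependencies.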

Abstract

Unsupervised Domain Adaptation (UDA) can transfer knowledge from labeled source data to unlabeled target data of the same categories. However, UDA for first-person action recognition is an under-explored problem, with a lack of datasets and limited consideration of first-person video characteristics. This paper focuses on addressing this problem. Firstly, we propose two small-scale first-person video domain adaptation datasets: ADL$_{small}$ and GTEA-KITCHEN. Secondly, we introduce channel-temporal attention blocks to capture the channel-wise and temporal-wise relationships and model their inter-dependencies important to first-person vision. Finally, we propose a Channel-Temporal Attention Network (CTAN) to integrate these blocks into existing architectures. CTAN outperforms baselines on the two proposed datasets and one existing dataset, EPIC$_{cvpr20}$.
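
The abstract does not spell out how the channel-temporal attention blocks are built, so the following is only an illustrative sketch of the general idea of gating features along both the channel and the temporal axis. It is not the authors' block: it assumes a clip is already reduced to a channels-by-time grid of pooled features, and it uses a parameter-free sigmoid gate where a real block would use learned layers. The names `channel_temporal_attention` and `sigmoid` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_temporal_attention(features):
    """Rescale a channels-by-time feature grid with two gates.

    `features` is a list of C rows (channels), each holding T floats
    (time steps). ASSUMPTION: this parameter-free gating is a stand-in
    for learned attention layers, not the design used in CTAN.
    """
    num_channels = len(features)
    num_steps = len(features[0])

    # Channel descriptor: average each channel over all time steps.
    channel_desc = [sum(row) / num_steps for row in features]

    # Temporal descriptor: average each time step over all channels.
    temporal_desc = [
        sum(features[c][t] for c in range(num_channels)) / num_channels
        for t in range(num_steps)
    ]

    # One gate per channel and one gate per time step, each in (0, 1).
    channel_gate = [sigmoid(d) for d in channel_desc]
    temporal_gate = [sigmoid(d) for d in temporal_desc]

    # Every entry is scaled by both its channel gate and its time-step
    # gate, so the two axes jointly decide how much of it is kept.
    return [
        [features[c][t] * channel_gate[c] * temporal_gate[t]
         for t in range(num_steps)]
        for c in range(num_channels)
    ]
```

Calling `channel_temporal_attention([[2.0, 0.0], [0.0, 0.0]])` returns a grid of the same shape in which the only non-zero input is scaled down by the product of its channel gate and its time-step gate. In a trained network the gates would come from learned weights rather than from raw averages.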

Taxonomy

Topics: Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning · Anomaly Detection Techniques and Applications