CycDA: Unsupervised Cycle Domain Adaptation from Image to Video
Wei Lin, Anna Kukleva, Kunyang Sun, Horst Possegger, Hilde Kuehne, Horst Bischof

TL;DR
CycDA introduces a cycle-based unsupervised domain adaptation method that effectively bridges spatial and modality gaps between web images and videos, improving action recognition without labeled video data.
Contribution
The paper proposes a novel cycle-based framework for unsupervised image-to-video domain adaptation, addressing spatial and modality shifts through alternating spatial and spatio-temporal learning.
Findings
Achieves state-of-the-art results on benchmark datasets.
Effectively bridges domain and modality gaps.
Demonstrates benefits of cyclic adaptation in action recognition.
Abstract
Although action recognition has achieved impressive results over recent years, both collection and annotation of video training data are still time-consuming and cost intensive. Therefore, image-to-video adaptation has been proposed to exploit labeling-free web image source for adapting on unlabeled target videos. This poses two major challenges: (1) spatial domain shift between web images and video frames; (2) modality gap between image and video data. To address these challenges, we propose Cycle Domain Adaptation (CycDA), a cycle-based approach for unsupervised image-to-video domain adaptation by leveraging the joint spatial information in images and videos on the one hand and, on the other hand, training an independent spatio-temporal model to bridge the modality gap. We alternate between the spatial and spatio-temporal learning with knowledge transfer between the two in each cycle.…
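One way to picture the knowledge transfer from the spatial (image) model to the spatio-temporal (video) model is through video-level pseudo-labels. The sketch below is illustrative only and is not taken from the paper: the abstract does not specify the transfer mechanism, so frame-level averaging, the confidence threshold, and the name `video_pseudo_label` are all assumptions made for this example.

```python
from typing import List, Optional, Tuple


def video_pseudo_label(
    frame_probs: List[List[float]],
    threshold: float = 0.6,  # hypothetical confidence cut-off, not from the paper
) -> Optional[Tuple[int, float]]:
    """Average per-frame class probabilities from an image model.

    Returns (class_index, confidence) for the video, or None when the
    averaged confidence falls below the threshold, in which case the
    video would be skipped when training the video model.
    """
    if not frame_probs:
        return None
    num_classes = len(frame_probs[0])
    num_frames = len(frame_probs)
    # Mean probability of each class over all frames of the video.
    mean = [sum(p[c] for p in frame_probs) / num_frames for c in range(num_classes)]
    best = max(range(num_classes), key=lambda c: mean[c])
    if mean[best] < threshold:
        return None
    return best, mean[best]


# Two frames, two classes: the averaged prediction favours class 0.
print(video_pseudo_label([[0.75, 0.25], [0.5, 0.5]]))
```

In a cyclic scheme of the kind the abstract describes, labels like these could supervise the video model, whose predictions would then feed back into the next round of spatial training; the exact procedure is left to the paper itself.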
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition · Multimodal Machine Learning Applications
