Augmenting and Aligning Snippets for Few-Shot Video Domain Adaptation

Yuecong Xu; Jianfei Yang; Yunjiao Zhou; Zhenghua Chen; Min Wu; Xiaoli Li

arXiv:2303.10451·cs.CV·March 21, 2023·1 cites

TL;DR

This paper introduces SSA2lign, a novel method for Few-Shot Video-based Domain Adaptation that enhances target domain data with snippet-level augmentation and attentive semantic and statistical alignment, improving cross-domain action recognition.

Contribution

It presents SSA2lign, a new approach that leverages snippet-level augmentation and multi-perspective semantic alignment for FSVDA, addressing limitations of existing methods.
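The idea of expanding a small target set at the snippet level can be illustrated with a minimal sketch. It is not the paper's implementation: the excerpt here does not describe how SSA2lign builds or augments snippets, so the functions `sample_snippets` and `expand_target_set`, the random start positions, and all parameters are hypothetical choices made only for illustration. The sketch assumes a snippet is a short run of consecutive frames and shows how a few labeled target videos could yield many training snippets.

```python
import random


def sample_snippets(num_frames, snippet_len, num_snippets, seed=None):
    """Sample snippets as lists of consecutive frame indices.

    Hypothetical helper: start positions are drawn uniformly at random,
    which is an assumption and not taken from the paper.
    """
    if snippet_len > num_frames:
        raise ValueError("snippet_len cannot exceed num_frames")
    rng = random.Random(seed)
    last_start = num_frames - snippet_len
    snippets = []
    for _ in range(num_snippets):
        start = rng.randint(0, last_start)  # randint is inclusive on both ends
        snippets.append(list(range(start, start + snippet_len)))
    return snippets


def expand_target_set(videos, snippet_len, snippets_per_video, seed=0):
    """Turn a few labeled videos into many labeled snippets.

    `videos` maps a video id to a (num_frames, label) pair. Each returned
    item is (video_id, frame_indices, label); every snippet inherits the
    label of the video it was cut from.
    """
    rng = random.Random(seed)
    expanded = []
    for video_id, (num_frames, label) in videos.items():
        video_seed = rng.randrange(2**32)  # distinct but reproducible per video
        for frames in sample_snippets(
            num_frames, snippet_len, snippets_per_video, seed=video_seed
        ):
            expanded.append((video_id, frames, label))
    return expanded


if __name__ == "__main__":
    # Two target videos (a 2-shot toy setting) expanded into 8 snippets.
    few_shot = {"vid_a": (30, 0), "vid_b": (16, 1)}
    for video_id, frames, label in expand_target_set(few_shot, 8, 4):
        print(video_id, frames[0], frames[-1], label)
```

In this toy setting each video contributes several overlapping temporal views, so any alignment step would see more target samples than the original few videos provide. Further augmentation of each snippet (spatial or otherwise) could be layered on top of this sampling.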

Findings

01 Achieves state-of-the-art results on multiple benchmarks.

02 Effectively expands target domain data with simple augmentation.

03 Improves alignment by considering semantic and statistical features.

Abstract

For video models to be transferred and applied seamlessly across video tasks in varied environments, Video Unsupervised Domain Adaptation (VUDA) has been introduced to improve the robustness and transferability of video models. However, current VUDA methods rely on a vast amount of high-quality unlabeled target data, which may not be available in real-world cases. We thus consider a more realistic Few-Shot Video-based Domain Adaptation (FSVDA) scenario where we adapt video models with only a few target video samples. While a few methods have touched upon Few-Shot Domain Adaptation (FSDA) in images and in FSVDA, they rely primarily on spatial augmentation for target domain expansion with alignment performed statistically at the instance level. However, videos contain more knowledge in terms of rich temporal and semantic information, which should be fully considered while…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition · Multimodal Machine Learning Applications