Modality-Collaborative Low-Rank Decomposers for Few-Shot Video Domain Adaptation
Yuyang Wanyan, Xiaoshan Yang, Weiming Dong, and Changsheng Xu

TL;DR
This paper introduces a novel framework called MC-LRD for few-shot video domain adaptation, effectively decomposing and aligning multimodal features to improve cross-domain generalization in videos.
Contribution
We propose Modality-Collaborative Low-Rank Decomposers (MC-LRD), a new method that decomposes modality-specific and shared features with domain-aware alignment for few-shot video adaptation.
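To make the idea of splitting a modality's feature into a shared part and a modality-specific part concrete, here is a minimal sketch. It is not the authors' implementation: MC-LRD is described as a learned method, whereas this sketch swaps in a fixed rank-r orthogonal projection. The names `low_rank_apply`, `decompose`, and `shared_basis` are hypothetical, and the basis is assumed to be orthonormal.

```python
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def low_rank_apply(U: List[Vector], V: List[Vector], x: Vector) -> Vector:
    """Apply the rank-r map U V^T to x.

    U and V each hold r column vectors of length d, so the map is
    stored with 2*r*d numbers instead of d*d.
    """
    coeffs = [dot(v, x) for v in V]  # r coefficients: V^T x
    return [sum(c * u[i] for c, u in zip(coeffs, U)) for i in range(len(x))]


def decompose(x: Vector, shared_basis: List[Vector]) -> Tuple[Vector, Vector]:
    """Split x into a shared part and a modality-specific residual.

    Assumption (not from the paper): shared_basis is orthonormal, so
    B B^T is an orthogonal projection onto the shared subspace.
    """
    shared = low_rank_apply(shared_basis, shared_basis, x)
    unique = [xi - si for xi, si in zip(x, shared)]
    return shared, unique


# Toy features for two modalities (hypothetical values).
basis = [[1.0, 0.0, 0.0]]  # rank-1 shared subspace
rgb_shared, rgb_unique = decompose([2.0, 1.0, 0.0], basis)
flow_shared, flow_unique = decompose([2.0, 0.0, 3.0], basis)
```

In this toy case both modalities yield the same shared component, while the residuals differ per modality. In a learned setting the factors would be trained rather than fixed, and the split would generally not be an exact orthogonal projection.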
Findings
Significant performance improvements on three benchmarks.
Effective decomposition of modality-unique and shared features.
Enhanced domain alignment through cross-domain activation consistency.
Abstract
In this paper, we study the challenging task of Few-Shot Video Domain Adaptation (FSVDA). The multimodal nature of videos introduces unique challenges, necessitating the simultaneous consideration of both domain alignment and modality collaboration in a few-shot scenario, which is ignored in previous literature. We observe that, under the influence of domain shift, the generalization performance on the target domain of each individual modality, as well as that of fused multimodal features, is constrained, because each modality comprises coupled features with multiple components that exhibit different domain shifts. This variability increases the complexity of domain adaptation, thereby reducing the effectiveness of multimodal feature integration. To address these challenges, we introduce a novel framework of Modality-Collaborative Low-Rank Decomposers (MC-LRD) to decompose…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications
