Leveraging Endo- and Exo-Temporal Regularization for Black-box Video Domain Adaptation
Yuecong Xu, Jianfei Yang, Haozhi Cao, Min Wu, Xiaoli Li, Lihua Xie, Zhenghua Chen

TL;DR
This paper introduces EXTERN, a novel black-box video domain adaptation method utilizing endo- and exo-temporal regularizations, achieving state-of-the-art results without requiring source data or model parameters.
Contribution
The paper proposes a new approach for black-box video domain adaptation using temporal regularizations and mask-to-mix strategies, addressing privacy and portability issues.
Findings
Achieves state-of-the-art performance on multiple benchmarks.
Outperforms existing methods that require source data.
Effectively adapts across various cross-domain video tasks.
Abstract
To enable video models to be applied seamlessly across video tasks in different environments, various Video Unsupervised Domain Adaptation (VUDA) methods have been proposed to improve the robustness and transferability of video models. Despite these improvements in model robustness, VUDA methods require access to both source data and source model parameters for adaptation, raising serious data privacy and model portability issues. To address these concerns, this paper first formulates Black-box Video Domain Adaptation (BVDA) as a more realistic yet challenging scenario in which the source video model is provided only as a black-box predictor. While a few methods for Black-box Domain Adaptation (BDA) have been proposed in the image domain, these methods cannot be applied to the video domain, since the video modality has more complicated temporal features that are harder to align. To address BVDA, we…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition
Methods: Contrastive Language-Image Pre-training
