Exploiting Instance-based Mixed Sampling via Auxiliary Source Domain Supervision for Domain-adaptive Action Detection
Yifan Lu, Gurkirt Singh, Suman Saha, Luc Van Gool

TL;DR
This paper introduces DA-AIM, a novel domain adaptation method for action detection that uses instance-based mixed sampling and auxiliary source domain supervision, significantly improving performance on challenging benchmarks.
Contribution
It proposes a new action instance mixed sampling technique and a training protocol with auxiliary source domain supervision for domain-adaptive action detection.
Findings
DA-AIM outperforms prior methods on benchmark datasets.
The proposed mixed sampling improves domain transfer for action detection.
Using auxiliary source domain supervision addresses long-tail and domain shift issues.
Abstract
We propose a novel domain-adaptive action detection approach and a new adaptation protocol that leverage recent advances in image-level unsupervised domain adaptation (UDA) techniques and handle the vagaries of instance-level video data. Self-training combined with cross-domain mixed sampling has shown remarkable performance gains in semantic segmentation in the UDA context. Motivated by this fact, we propose an approach for human action detection in videos that transfers knowledge from the source domain (annotated dataset) to the target domain (unannotated dataset) using mixed sampling and pseudo-label-based self-training. Existing UDA techniques follow the ClassMix algorithm for semantic segmentation. However, simply adopting ClassMix for action detection does not work, mainly because these are two entirely different problems, i.e., pixel-label…
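The abstract contrasts pixel-level mixing (ClassMix) with mixing at the level of action instances. As a rough illustration of the general idea only, the sketch below pastes the bounding-box regions of a source clip onto a target clip. It is not the authors' implementation: the function name `paste_instances`, the `(T, H, W, C)` clip layout, and the simplification that each box stays fixed across all frames are assumptions made for this example, and the pseudo-labelling step is omitted.

```python
import numpy as np


def paste_instances(source_clip, source_boxes, target_clip):
    """Paste box regions of a source clip onto a target clip.

    Hypothetical helper for illustration; not taken from the paper.
    source_clip, target_clip: arrays of shape (T, H, W, C).
    source_boxes: list of (x1, y1, x2, y2) integer pixel boxes, assumed
    here to be identical in every frame.
    """
    if source_clip.shape != target_clip.shape:
        raise ValueError("clips must share the same shape")
    _, height, width, _ = source_clip.shape

    # Binary mask that is True wherever a source instance box lies.
    mask = np.zeros((height, width), dtype=bool)
    for x1, y1, x2, y2 in source_boxes:
        mask[y1:y2, x1:x2] = True

    # Broadcast the (H, W) mask over the time and channel axes.
    mixed = np.where(mask[None, :, :, None], source_clip, target_clip)
    return mixed, mask


# Toy clips: source is all ones, target is all zeros.
source = np.ones((4, 8, 8, 3), dtype=np.float32)
target = np.zeros((4, 8, 8, 3), dtype=np.float32)
mixed, mask = paste_instances(source, [(2, 1, 6, 5)], target)
```

In a self-training setup of the kind the abstract describes, the mixed clip would be paired with the source annotations inside the pasted boxes and pseudo-labels elsewhere; how those labels are combined is left out here because the truncated abstract does not specify it.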
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
