From Traditional to Modern : Domain Adaptation for Action Classification in Short Social Video Clips
Aditya Singh, Saurabh Saini, Rajvi Shah, and P J Narayanan

TL;DR
This paper presents a simple domain adaptation method using semantic embeddings and data augmentation to improve action classification in wild social video clips, addressing distribution differences between traditional datasets and real-world vines.
Contribution
Introduces a domain adaptation approach that uses a semantic word2vec space and multi-modal features for action classification in unstructured social videos.
Findings
Notable performance improvements on a test set of vines
Effective use of semantic embeddings for domain alignment
Simple yet effective adaptation strategy
Abstract
Short internet video clips like vines follow a significantly wilder distribution than traditional video datasets. In this paper, we focus on the problem of unsupervised action classification in wild vines using traditional labeled datasets. To this end, we use a simple domain adaptation strategy based on data augmentation. We utilise the semantic word2vec space as a common subspace to embed video features from both the labeled source domain and the unlabeled target domain. Our method incrementally augments the labeled source with target samples and iteratively modifies the embedding function to bring the source and target distributions together. Additionally, we utilise a multi-modal representation that incorporates noisy semantic information available in the form of hash-tags. We show the effectiveness of this simple adaptation technique on a test set of vines and achieve notable improvements in…
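The incremental augmentation loop described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the embedding function is a ridge-regression linear map, target samples are chosen by cosine similarity to the class word vectors, and the names `fit_embedding`, `predict`, `adapt`, `rounds`, and `per_round` are hypothetical. `class_vecs` stands in for word2vec vectors of the action labels.

```python
import numpy as np

def fit_embedding(X, Y, reg=1.0):
    """Fit a linear map W (assumed form) so that X @ W approximates Y.

    Ridge regression closed form: W = (X^T X + reg * I)^-1 X^T Y.
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ Y)

def predict(X, W, class_vecs):
    """Embed features and return the nearest class index and its cosine similarity."""
    Z = X @ W
    Z = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12)
    C = class_vecs / (np.linalg.norm(class_vecs, axis=1, keepdims=True) + 1e-12)
    sims = Z @ C.T
    return sims.argmax(axis=1), sims.max(axis=1)

def adapt(Xs, ys, Xt, class_vecs, rounds=5, per_round=10, reg=1.0):
    """Iteratively add confident target samples to the source set and refit W.

    Xs, ys     : labeled source features and integer class labels
    Xt         : unlabeled target features
    class_vecs : one semantic vector per class (e.g. word2vec of the label)
    """
    X_aug, y_aug = Xs.copy(), ys.copy()
    remaining = np.arange(len(Xt))  # target samples not yet added
    W = fit_embedding(X_aug, class_vecs[y_aug], reg)
    for _ in range(rounds):
        if remaining.size == 0:
            break
        labels, conf = predict(Xt[remaining], W, class_vecs)
        top = np.argsort(-conf)[:per_round]  # most confident target samples
        # Augment the training set with pseudo-labeled target samples.
        X_aug = np.vstack([X_aug, Xt[remaining[top]]])
        y_aug = np.concatenate([y_aug, labels[top]])
        remaining = np.delete(remaining, top)
        # Refit the embedding on the augmented set.
        W = fit_embedding(X_aug, class_vecs[y_aug], reg)
    return W
```

After `adapt` returns, `predict(Xt, W, class_vecs)` gives action labels for the target clips. Selecting the most confident samples first is one plausible way to limit the effect of wrong pseudo-labels in early rounds; the paper may use a different selection rule.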
