Federated Self-supervised Learning for Video Understanding
Yasar Abbas Ur Rehman, Yan Gao, Jiajun Shen, Pedro Porto Buarque de Gusmao, Nicholas Lane

TL;DR
This paper introduces FedVSSL, a federated self-supervised learning framework for videos that addresses privacy and communication challenges, outperforming centralized methods on key video retrieval benchmarks.
Contribution
The paper proposes a novel federated SSL framework for videos, FedVSSL, with new aggregation strategies and partial weight updates, improving performance over centralized methods.
Findings
FedVSSL outperforms the centralized SOTA by 6.66% on UCF-101 video retrieval
FedVSSL outperforms the centralized SOTA by 5.13% on HMDB-51 video retrieval
Effective federated SSL for large-scale video data
Abstract
The ubiquity of camera-enabled mobile devices has led to large amounts of unlabelled video data being produced at the edge. Although various self-supervised learning (SSL) methods have been proposed to harvest their latent spatio-temporal representations for task-specific training, practical challenges including privacy concerns and communication costs prevent SSL from being deployed at large scale. To mitigate these issues, we propose applying Federated Learning (FL) to the task of video SSL. In this work, we evaluate the performance of current state-of-the-art (SOTA) video-SSL techniques and identify their shortcomings when integrated into a large-scale FL setting simulated with the Kinetics-400 dataset. We then propose a novel federated SSL framework for video, dubbed FedVSSL, that integrates different aggregation strategies and partial weight updating. Extensive experiments…
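The abstract does not say which aggregation strategies FedVSSL combines or which weights it updates, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes plain FedAvg-style averaging weighted by client sample count, and it assumes that "partial weight updating" means the server overwrites only a subset of parameters. The function name `fedavg_partial`, the `backbone.` name prefix, and the toy weights are all hypothetical.

```python
from typing import Dict, List

# A model is represented as a mapping from parameter name to a flat list of floats.
Weights = Dict[str, List[float]]


def fedavg_partial(
    global_weights: Weights,
    client_weights: List[Weights],
    client_sizes: List[int],
    prefix: str = "backbone.",  # hypothetical naming convention, not from the paper
) -> Weights:
    """Sample-count-weighted average of client parameters, applied only to
    parameters whose name starts with `prefix`. All other parameters keep
    the server's previous values."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    new_weights: Weights = {}
    for name, old in global_weights.items():
        if not name.startswith(prefix):
            # Partial update: leave this parameter untouched on the server.
            new_weights[name] = list(old)
            continue
        averaged = [0.0] * len(old)
        for weights, size in zip(client_weights, client_sizes):
            for i, value in enumerate(weights[name]):
                averaged[i] += value * size / total
        new_weights[name] = averaged
    return new_weights


# Toy example: two clients holding 1 and 3 local samples respectively.
server = {"backbone.conv1": [0.0, 0.0], "head.fc": [1.0, 1.0]}
clients = [
    {"backbone.conv1": [1.0, 2.0], "head.fc": [5.0, 5.0]},
    {"backbone.conv1": [3.0, 6.0], "head.fc": [9.0, 9.0]},
]
updated = fedavg_partial(server, clients, client_sizes=[1, 3])
print(updated)
```

In this toy run only `backbone.conv1` is replaced by the weighted client average, while `head.fc` keeps the server's earlier values. Excluding part of the model from each round is one plausible way to reduce what must be communicated and aggregated, but the paper itself should be consulted for the strategy FedVSSL actually uses.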
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Privacy-Preserving Technologies in Data · Video Surveillance and Tracking Methods
Methods: 3D Convolution
