Beyond Just Vision: A Review on Self-Supervised Representation Learning on Multimodal and Temporal Data

Shohreh Deldari; Hao Xue; Aaqib Saeed; Jiayuan He; Daniel V. Smith; Flora D. Salim

arXiv:2206.02353·cs.LG·June 9, 2022·25 cites

Open Access

TL;DR

This paper provides a comprehensive review of self-supervised representation learning methods for multimodal and temporal data, highlighting their architectures, objectives, applications, and future challenges.

Contribution

It is the first review to systematically categorize and analyze multimodal SSRL methods specifically for temporal data across various modalities.

Findings

01 Categorization of SSRL methods and their key components

02 Comparison of models based on objectives and architectures

03 Identification of current weaknesses and future research directions

Abstract

Self-Supervised Representation Learning (SSRL) has attracted much attention in the fields of computer vision, speech, and natural language processing (NLP), and more recently for other types of modalities, including time series from sensors. The popularity of self-supervised learning is driven by the fact that traditional models typically require a huge amount of well-annotated data for training. Acquiring annotated data can be a difficult and costly process. Self-supervised methods have been introduced to improve the efficiency of training data through discriminative pre-training of models using supervisory signals that have been freely obtained from the raw data. Unlike existing reviews of SSRL that have predominantly focused upon methods in the fields of CV or NLP for a single modality, we aim to provide the first comprehensive review of multimodal self-supervised learning methods…

Taxonomy

Topics: Text and Document Classification Technologies · Topic Modeling · Domain Adaptation and Few-Shot Learning