Learning Shared Representations from Unpaired Data
Amitai Yacobi, Nir Ben-Ari, Ronen Talmon, Uri Shaham

TL;DR
This paper shows that shared multimodal representations can be learned effectively from unpaired data using spectral embeddings, enabling cross-modal tasks without relying on paired samples.
Contribution
The paper introduces a spectral embedding approach that learns shared representations from unpaired data, departing from the reliance on paired samples that is standard in multimodal learning.
Findings
Effective cross-modal retrieval demonstrated
High performance in zero-shot and cross-domain classification
Potential for universal, modality-independent embeddings
Abstract
Learning shared representations is a primary area of multimodal representation learning. Current approaches to achieving a shared embedding space rely heavily on paired samples from each modality, which are significantly harder to obtain than unpaired ones. In this work, we demonstrate that shared representations can be learned almost exclusively from unpaired data. Our arguments are grounded in the spectral embeddings of the random walk matrices constructed independently from each unimodal representation. Empirical results in computer vision and natural language processing domains support the potential of this approach, revealing the effectiveness of unpaired data in capturing meaningful cross-modal relations and demonstrating high capabilities in retrieval tasks, generation, arithmetic, and zero-shot and cross-domain classification. This work, to the best of our knowledge, is the first to demonstrate…
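The abstract's central object, a spectral embedding of a random walk matrix built separately for each modality, can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian affinity kernel, the median bandwidth heuristic, the function names, and the random stand-in features are all assumptions made for illustration. The step that relates the two embeddings to each other is not described in the abstract and is not shown here.

```python
import numpy as np

def gaussian_affinity(features, sigma=None):
    """Dense Gaussian affinity matrix W (kernel choice is an assumption)."""
    x = np.asarray(features, dtype=float)
    sq_norms = np.sum(x ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped at zero for numerical safety.
    d2 = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T), 0.0)
    if sigma is None:
        # Median heuristic for the bandwidth (an assumption, not from the paper).
        sigma = np.sqrt(np.median(d2[d2 > 0]))
    return np.exp(-d2 / (2.0 * sigma ** 2))

def random_walk_matrix(features, sigma=None):
    """Row-stochastic random walk matrix P = D^{-1} W."""
    w = gaussian_affinity(features, sigma)
    return w / w.sum(axis=1, keepdims=True)

def spectral_embedding(features, n_components=2, sigma=None):
    """Leading non-trivial eigenvalues and right eigenvectors of P."""
    w = gaussian_affinity(features, sigma)
    d_inv_sqrt = 1.0 / np.sqrt(w.sum(axis=1))
    # S = D^{-1/2} W D^{-1/2} is symmetric and shares its eigenvalues with P.
    s = w * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(s)           # ascending order
    order = np.argsort(eigvals)[::-1]              # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Right eigenvectors of P are D^{-1/2} v for each eigenvector v of S.
    psi = eigvecs * d_inv_sqrt[:, None]
    # Drop the trivial constant eigenvector (eigenvalue 1).
    return eigvals[1:n_components + 1], psi[:, 1:n_components + 1]

# Hypothetical stand-ins for two unimodal feature sets; the sample counts differ
# and no row of one set is paired with a row of the other.
rng = np.random.default_rng(0)
image_features = rng.normal(size=(50, 8))
text_features = rng.normal(size=(60, 5))

img_vals, img_emb = spectral_embedding(image_features, n_components=3)
txt_vals, txt_emb = spectral_embedding(text_features, n_components=3)
```

Each modality is embedded on its own, so nothing in this sketch requires paired samples; the two embeddings simply have the same number of coordinates per sample.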
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning
