TL;DR
This paper introduces a neural network model that learns a shared representation across multiple views using only parallel data between each view and a common pivot view, enabling cross-lingual and multimodal tasks without direct parallel data between the views of interest.
Contribution
It proposes a generic bridge correlational neural network model that learns a common representation across multiple views using only the parallel data between each view and a pivot view. The model extends to n views of interest bridged by a single pivot view.
Findings
Achieved state-of-the-art in multilingual document classification.
Demonstrated promising results in multilingual multimodal retrieval.
Validated the model on a new dataset created for this purpose.
Abstract
Recently there has been a lot of interest in learning common representations for multiple views of data. Typically, such common representations are learned using a parallel corpus between the two views (say, 1M images and their English captions). In this work, we address a real-world scenario where no direct parallel data is available between two views of interest (say, $V_1$ and $V_2$) but parallel data is available between each of these views and a pivot view ($V_3$). We propose a model for learning a common representation for $V_1$, $V_2$ and $V_3$ using only the parallel data available between $V_1 V_3$ and $V_2 V_3$. The proposed model is generic and even works when there are $n$ views of interest and only one pivot view which acts as a bridge between them. There are two specific downstream applications that we focus on (i) transfer learning between languages $L_1, L_2, \ldots, L_n$…
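The pivot setup described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it shows only a correlation term summed over (view, pivot) pairs, so the two views of interest are never paired directly. The linear-plus-tanh encoders, the view names, and the function names `summed_correlation` and `bridge_correlation_loss` are all assumptions made for illustration, and any other terms a full training objective might include are left out.

```python
import numpy as np


def summed_correlation(h_a, h_b, eps=1e-8):
    """Sum, over hidden dimensions, of the Pearson correlation between two
    batches of representations, each of shape (n_samples, n_hidden)."""
    a = h_a - h_a.mean(axis=0)
    b = h_b - h_b.mean(axis=0)
    num = (a * b).sum(axis=0)
    den = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0)) + eps
    return float((num / den).sum())


def bridge_correlation_loss(encoders, pivot_name, paired_batches):
    """Negative summed correlation, accumulated over (view, pivot) pairs only.

    encoders: dict mapping a view name to a weight matrix of shape
        (n_features, n_hidden). These linear encoders are hypothetical.
    paired_batches: dict mapping a non-pivot view name to a tuple
        (x_view, x_pivot) whose rows are aligned examples.
    """
    loss = 0.0
    for view, (x_view, x_pivot) in paired_batches.items():
        h_view = np.tanh(x_view @ encoders[view])
        h_pivot = np.tanh(x_pivot @ encoders[pivot_name])
        # Only view-to-pivot correlation is rewarded; no view-to-view term.
        loss -= summed_correlation(h_view, h_pivot)
    return loss


# Toy usage: two views of interest, each paired only with the pivot.
rng = np.random.default_rng(0)
encoders = {
    "view_a": rng.normal(size=(5, 3)),
    "view_b": rng.normal(size=(7, 3)),
    "pivot": rng.normal(size=(4, 3)),
}
paired_batches = {
    "view_a": (rng.normal(size=(32, 5)), rng.normal(size=(32, 4))),
    "view_b": (rng.normal(size=(32, 7)), rng.normal(size=(32, 4))),
}
loss = bridge_correlation_loss(encoders, "pivot", paired_batches)
```

Because every non-pivot encoder is pulled toward the same pivot encoder, representations of the two views of interest can end up comparable even though no batch ever pairs them directly.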
