Effective Combination of Language and Vision Through Model Composition and the R-CCA Method

Hagar Loeub; Roi Reichart

arXiv:1609.08810·cs.CL·October 5, 2016

TL;DR

This paper introduces the R-CCA method for combining textual and visual data in vector space models, demonstrating that sequential composition of various modeling techniques enhances semantic representation quality.

Contribution

The paper proposes the R-CCA method and a sequential modeling framework that outperforms recent multimodal representation learning approaches on five standard semantic benchmarks.

Findings

1. R-CCA improves multimodal word representations.

2. Sequential composition of models yields superior performance.

3. The sequential models outperform recent multimodal representation learning alternatives, including joint representation learning, on five standard semantic benchmarks.

Abstract

We address the problem of integrating textual and visual information in vector space models for word meaning representation. We first present the Residual CCA (R-CCA) method, which complements the standard CCA method by representing, for each modality, the difference between the original signal and the signal projected to the shared, maximum-correlation space. We then show that constructing visual and textual representations and then post-processing them through a composition of common modeling motifs such as PCA, CCA, R-CCA and linear interpolation (a.k.a. sequential modeling) yields high-quality models. On five standard semantic benchmarks our sequential models outperform recent multimodal representation learning alternatives, including ones that rely on joint representation learning. For two of these benchmarks our R-CCA method is part of the best configuration our algorithm yields.
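
The residual idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that both modalities have the same dimensionality and that CCA keeps all dimensions, so that "original minus projected" is a well-defined subtraction; the abstract does not specify how the paper handles dimensionality. The function names `fit_cca`, `r_cca` and `interpolate`, the ridge term `reg`, and the weight `alpha` are all hypothetical choices made for illustration.

```python
import numpy as np


def _inv_sqrt(C):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(C)
    return V @ np.diag(w ** -0.5) @ V.T


def fit_cca(X, Y, reg=1e-6):
    """Classical linear CCA via whitening and SVD.

    Returns the column means, the projection matrices A and B, and the
    canonical correlations s (in decreasing order).
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Small ridge term (an assumption) keeps the covariances invertible.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    Kx, Ky = _inv_sqrt(Cxx), _inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky)
    return mx, my, Kx @ U, Ky @ Vt.T, s


def r_cca(X, Y, reg=1e-6):
    """Return CCA projections and residuals for both modalities.

    ASSUMPTION: X and Y have the same number of columns and CCA keeps all
    of them, so the projected signal can be subtracted from the original.
    """
    if X.shape[1] != Y.shape[1]:
        raise ValueError("this sketch assumes equal dimensionality")
    mx, my, A, B, s = fit_cca(X, Y, reg)
    Xc, Yc = X - mx, Y - my
    Px, Py = Xc @ A, Yc @ B          # signals in the shared space
    return Px, Py, Xc - Px, Yc - Py, s  # residual = original - projected


def interpolate(text_vecs, visual_vecs, alpha=0.5):
    """Linear interpolation of two same-shaped representations."""
    return alpha * text_vecs + (1.0 - alpha) * visual_vecs
```

Under these assumptions, a sequential model in the sense of the abstract could chain the pieces, for example reducing each modality with PCA to a common dimensionality, calling `r_cca` on the reduced matrices, and then passing either the projections or the residuals to `interpolate`. Which chain works best is an empirical question that this sketch does not answer.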

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling

Methods: Principal Components Analysis