Linking Image and Text with 2-Way Nets

Aviv Eisenschtat; Lior Wolf

arXiv:1608.07973 · cs.CV · February 14, 2017

TL;DR

This paper introduces a bi-directional neural network architecture that links image and text data by projecting both views into a common, maximally correlated space using the Euclidean loss, and reports state-of-the-art results on matching tasks.

Contribution

The paper presents a novel two-way neural network model and shows a direct link between the correlation-based loss and the Euclidean loss, so that the Euclidean loss can be used for correlation maximization when matching data across modalities.

Findings

01. Achieved state-of-the-art results on MNIST image matching.

02. Performed well on sentence-image matching on the Flickr8k, Flickr30k, and COCO datasets.

03. Linked the correlation-based loss with the Euclidean loss for effective training.

Abstract

Linking two data sources is a basic building block in numerous computer vision problems. Canonical Correlation Analysis (CCA) achieves this by utilizing a linear optimizer in order to maximize the correlation between the two views. Recent work makes use of non-linear models, including deep learning techniques, that optimize the CCA loss in some feature space. In this paper, we introduce a novel, bi-directional neural network architecture for the task of matching vectors from two data sources. Our approach employs two tied neural network channels that project the two views into a common, maximally correlated space using the Euclidean loss. We show a direct link between the correlation-based loss and Euclidean loss, enabling the use of Euclidean loss for correlation maximization. To overcome common Euclidean regression optimization problems, we modify well-known techniques to our problem,…
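The link between correlation and Euclidean distance mentioned in the abstract can be illustrated numerically. The sketch below is not the paper's derivation or training code; it only checks the standard identity that, for two variables standardized to zero mean and unit variance, the mean squared distance equals 2(1 - ρ), where ρ is their Pearson correlation. The function names and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def standardize(v):
    """Shift to zero mean and scale to unit (population) variance."""
    return (v - v.mean()) / v.std()


def pearson(x, y):
    """Pearson correlation computed from the standardized variables."""
    return float(np.mean(standardize(x) * standardize(y)))


def mean_squared_distance(x, y):
    """Mean squared (Euclidean) distance between the standardized variables."""
    return float(np.mean((standardize(x) - standardize(y)) ** 2))


# Synthetic, correlated 1-D "views" (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.8 * x + 0.6 * rng.normal(size=1000)

rho = pearson(x, y)
mse = mean_squared_distance(x, y)

# For standardized variables: mean((x - y)^2) = 2 * (1 - rho),
# so lowering the Euclidean loss raises the correlation.
assert np.isclose(mse, 2.0 * (1.0 - rho))
```

Under this identity, minimizing the Euclidean loss between normalized outputs is equivalent to maximizing their correlation, which is the kind of relationship the abstract relies on. How the paper enforces the normalization inside the network is not shown here.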

Code & Models

Repositories

aviveise/2WayNet (official)

Taxonomy

Methods: Batch Normalization