Deep Generative Model for Joint Alignment and Word Representation

Miguel Rios; Wilker Aziz; Khalil Sima'an

arXiv:1802.05883 · cs.CL · April 24, 2018


TL;DR

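This paper introduces EmbedAlign, a deep generative model that jointly learns word embeddings and word alignments from translation data. It represents each word in context as a posterior probability density rather than a point estimate, so words can be compared by a measure of distributional overlap (e.g. KL divergence).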

Contribution

The paper treats equivalence through translation as a learning signal for word representation: a single deep generative model learns to embed and align jointly, marginalising latent lexical alignments, and embeds words as densities rather than points.

Findings

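1. Reports competitive results on standard lexical semantics benchmarks, including natural language inference.

2. Reports competitive results on paraphrasing and text similarity benchmarks.

3. Embeds words as posterior densities, which allows words to be compared in context via distributional overlap (e.g. KL divergence).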

Abstract

This work exploits translation data as a source of semantically relevant learning signal for models of word representation. In particular, we exploit equivalence through translation as a form of distributed context and jointly learn how to embed and align with a deep generative model. Our EmbedAlign model embeds words in their complete observed context and learns by marginalisation of latent lexical alignments. Besides, it embeds words as posterior probability densities, rather than point estimates, which allows us to compare words in context using a measure of overlap between distributions (e.g. KL divergence). We investigate our model's performance on a range of lexical semantics tasks achieving competitive results on several standard benchmarks including natural language inference, paraphrasing, and text similarity.
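The abstract mentions comparing words in context through the overlap between their posterior densities, with KL divergence as one example. The sketch below illustrates that kind of comparison only; it is not taken from the paper's code. It assumes each posterior is a Gaussian with diagonal covariance, and the function name `kl_diag_gaussians` and the example parameters are hypothetical.

```python
import math


def kl_diag_gaussians(mu_p, sigma_p, mu_q, sigma_q):
    """Return KL(p || q) for two Gaussians with diagonal covariance.

    Assumption (not stated in the abstract): each word posterior is
    N(mu, diag(sigma^2)), given as lists of per-dimension means and
    standard deviations.
    """
    if not (len(mu_p) == len(sigma_p) == len(mu_q) == len(sigma_q)):
        raise ValueError("all parameter vectors must have the same length")
    total = 0.0
    for mp, sp, mq, sq in zip(mu_p, sigma_p, mu_q, sigma_q):
        # Closed-form KL between two univariate Gaussians; the diagonal
        # case is the sum of these terms over dimensions.
        total += math.log(sq / sp) + (sp ** 2 + (mp - mq) ** 2) / (2 * sq ** 2) - 0.5
    return total


# Hypothetical 2-dimensional posteriors for the same word in two contexts.
context_a = ([0.2, -1.0], [0.5, 0.8])   # (means, standard deviations)
context_b = ([0.3, -0.7], [0.6, 0.8])

divergence = kl_diag_gaussians(*context_a, *context_b)
print(f"KL(context_a || context_b) = {divergence:.4f}")
```

KL divergence is asymmetric, so swapping the arguments generally gives a different value; a comparison that should not depend on argument order would need a symmetrised measure or a convention for which word plays the role of p.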


Code & Models

Repositories

uva-slpl/embedalign (TensorFlow, official)
