Discrete Auto-regressive Variational Attention Models for Text Modeling

Xianghong Fang; Haoli Bai; Jian Li; Zenglin Xu; Michael Lyu; Irwin King

arXiv:2106.08571·cs.LG·June 17, 2021

TL;DR

This paper introduces DAVAM, a discrete auto-regressive variational attention model for text modeling. It enriches the latent space, avoids posterior collapse, and outperforms several VAE baselines on language modeling tasks.

Contribution

The paper proposes a new auto-regressive variational attention mechanism with a discrete latent space, addressing two problems in text VAEs: information underrepresentation and posterior collapse.

Findings

1. DAVAM outperforms several VAE models on language modeling tasks.
2. The model effectively captures semantic dependencies in text.
3. The model is mathematically shown to be free from posterior collapse.

Abstract

Variational autoencoders (VAEs) have been widely applied to text modeling. In practice, however, they face two challenges: information underrepresentation and posterior collapse. The former arises because only the last hidden state of the LSTM encoder is transformed into the latent space, which is generally insufficient to summarize the data. The latter is a long-standing problem in VAE training, in which the optimization becomes trapped in a disastrous local optimum. In this paper, we propose the Discrete Auto-regressive Variational Attention Model (DAVAM) to address these challenges. Specifically, we introduce an auto-regressive variational attention approach that enriches the latent space by effectively capturing the semantic dependency from the input. We further design a discrete latent space for the variational attention and mathematically show that our model is free from posterior…
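To make the term "posterior collapse" concrete, here is a minimal sketch that is not taken from the paper or its repository. It assumes a single categorical latent with K codes and a uniform prior; both choices, and the helper name `kl_categorical_to_uniform`, are illustrative assumptions. A collapsed posterior equals the prior and has zero KL, so the latent carries no information. A deterministic (one-hot) code assignment instead has a constant KL of log K, which is one intuition for why discrete latents can resist collapse.

```python
import math


def kl_categorical_to_uniform(q):
    """Return KL(q || uniform) in nats for a categorical distribution q."""
    k = len(q)
    # Zero-probability entries contribute nothing (0 * log 0 = 0 convention).
    return sum(p * math.log(p * k) for p in q if p > 0.0)


K = 8  # hypothetical number of discrete codes

# Collapsed posterior: identical to the uniform prior, so it ignores the input.
collapsed = [1.0 / K] * K

# Deterministic posterior: all mass on one code, as in a hard code assignment.
one_hot = [0.0] * (K - 1) + [1.0]

kl_collapsed = kl_categorical_to_uniform(collapsed)
kl_one_hot = kl_categorical_to_uniform(one_hot)

print(f"KL(collapsed || prior) = {kl_collapsed:.4f}")
print(f"KL(one-hot   || prior) = {kl_one_hot:.4f}  (log K = {math.log(K):.4f})")
```

Under these assumptions, the KL term for the one-hot case does not depend on the encoder parameters, so the optimizer cannot lower it by pushing the posterior toward the prior. Whether this matches the argument in the paper should be checked against the full text.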

Code & Models

Repositories

sunset-clouds/DAVAM (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory