A Latent Variable Recurrent Neural Network for Discourse Relation Language Models
Yangfeng Ji, Gholamreza Haffari, Jacob Eisenstein

TL;DR
This paper introduces a latent variable recurrent neural network that jointly models word sequences and the discourse relations between adjacent sentences, improving performance on discourse relation classification, dialog act classification, and language modeling.
Contribution
It proposes a novel architecture that integrates latent discourse relations into RNNs, enhancing both discourse relation classification and language modeling.
Findings
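Outperforms state-of-the-art alternatives on implicit discourse relation classification in the Penn Discourse Treebank
Outperforms state-of-the-art alternatives on dialog act classification in the Switchboard corpus
Improves language modeling by marginalizing over latent discourse relations at test time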
Abstract
This paper presents a novel latent variable recurrent neural network architecture for jointly modeling sequences of words and (possibly latent) discourse relations between adjacent sentences. A recurrent neural network generates individual words, thus reaping the benefits of discriminatively-trained vector representations. The discourse relations are represented with a latent variable, which can be predicted or marginalized, depending on the task. The resulting model can therefore employ a training objective that includes not only discourse relation classification, but also word prediction. As a result, it outperforms state-of-the-art alternatives for two tasks: implicit discourse relation classification in the Penn Discourse Treebank, and dialog act classification in the Switchboard corpus. Furthermore, by marginalizing over latent discourse relations at test time, we obtain a…
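The abstract notes that the latent discourse relation can be either predicted or marginalized, depending on the task. The sketch below illustrates that distinction for a single sentence, assuming we already have a log prior over relations and a log-likelihood of the sentence's words under each relation. The function names, the three-relation setup, and all numbers are hypothetical and chosen for illustration; in the paper these quantities would come from the recurrent network, which is not reproduced here.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def marginal_log_likelihood(log_prior, log_lik):
    """Language-modeling use: log p(y) = log sum_z p(z) * p(y | z)."""
    return log_sum_exp([lp + ll for lp, ll in zip(log_prior, log_lik)])

def posterior(log_prior, log_lik):
    """p(z | y), obtained by normalizing the joint p(z) * p(y | z)."""
    joint = [lp + ll for lp, ll in zip(log_prior, log_lik)]
    norm = log_sum_exp(joint)
    return [math.exp(j - norm) for j in joint]

def predict_relation(log_prior, log_lik):
    """Classification use: index of the most probable relation given the words."""
    post = posterior(log_prior, log_lik)
    return max(range(len(post)), key=post.__getitem__)

# Toy inputs (made up): three hypothetical relation labels.
log_prior = [math.log(0.5), math.log(0.3), math.log(0.2)]   # p(z)
log_lik = [math.log(0.01), math.log(0.04), math.log(0.02)]  # p(y | z)

marginal = math.exp(marginal_log_likelihood(log_prior, log_lik))
best = predict_relation(log_prior, log_lik)
```

Both uses share the same two ingredients: summing the joint over relations gives a word-sequence probability for language modeling, while taking the arg max of the normalized joint gives a relation label for classification.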
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
