Text Modeling with Syntax-Aware Variational Autoencoders
Yijun Xiao, William Yang Wang

TL;DR
This paper introduces syntax-aware variational autoencoders (SAVAEs) that incorporate syntactic structures into text representations, improving reconstruction and enabling syntax modification in generated text.
Contribution
The paper proposes SAVAEs that dedicate a subspace for syntax in the latent space, enhancing text modeling by explicitly capturing syntactic information.
Findings
Lower reconstruction loss on four datasets
Capable of generating text with modified syntax
Effective in capturing syntactic structures
Abstract
Syntactic information contains structures and rules about how text sentences are arranged. Incorporating syntax into text modeling methods can potentially benefit both representation learning and generation. Variational autoencoders (VAEs) are deep generative models that provide a probabilistic way to describe observations in the latent space. When applied to text data, the latent representations are often unstructured. We propose syntax-aware variational autoencoders (SAVAEs) that dedicate a subspace of the latent dimensions, dubbed the syntactic latent, to representing the syntactic structures of sentences. SAVAEs are trained to infer the syntactic latent from either text inputs or parsed syntax results, and to reconstruct the original text from the inferred latent variables. Experiments show that SAVAEs achieve lower reconstruction loss on four different data sets. Furthermore, they are…
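The idea of reserving part of the latent vector for syntax can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract does not specify the encoder, the decoder, the prior, or the exact training objective. The sketch assumes a standard VAE setup (diagonal Gaussian posterior, standard normal prior, reparameterized sampling) and assumes the first `syn_dim` dimensions play the role of the syntactic latent. All function names are hypothetical.

```python
import math
import random


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for a diagonal Gaussian."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), one draw per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def split_latent(z, syn_dim):
    """Split z into (syntactic part, remaining part).

    Assumption for illustration: the first syn_dim dimensions are the
    dedicated syntactic subspace.
    """
    return z[:syn_dim], z[syn_dim:]


def swap_syntax(z_a, z_b, syn_dim):
    """Combine the syntactic dimensions of z_b with the remaining dimensions of z_a."""
    syn_b, _ = split_latent(z_b, syn_dim)
    _, rest_a = split_latent(z_a, syn_dim)
    return syn_b + rest_a


def negative_elbo(recon_nll, mu, logvar):
    """Standard VAE loss: reconstruction negative log-likelihood plus the KL term.

    recon_nll would come from a text decoder, which is not modeled here.
    """
    return recon_nll + kl_to_standard_normal(mu, logvar)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical posterior parameters for two sentences (4 latent dims, 2 syntactic).
    z_a = reparameterize([0.5, -0.2, 1.0, 0.3], [-1.0, -1.0, -1.0, -1.0], rng)
    z_b = reparameterize([-0.7, 0.9, 0.1, -0.4], [-1.0, -1.0, -1.0, -1.0], rng)
    z_mixed = swap_syntax(z_a, z_b, syn_dim=2)
    print(len(z_mixed))
```

Decoding a code like `z_mixed` is one plausible way to read "generating text with modified syntax": the syntactic dimensions come from one sentence and the rest from another. Whether the paper performs the modification this way is not stated in the summary above.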
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Generative Adversarial Networks and Image Synthesis
