Structured Attention Networks
Yoon Kim, Carl Denton, Luong Hoang, Alexander M. Rush

TL;DR
This paper introduces structured attention networks that incorporate graphical model-based structural dependencies into deep neural networks, enhancing their ability to model complex structures in tasks like translation and question answering.
Contribution
It presents a simple extension of attention mechanisms to include structured distributions, enabling richer dependency modeling within end-to-end trainable neural networks.
Findings
Structured attention outperforms baseline attention models on tasks such as translation and question answering.
Models learn meaningful unsupervised hidden representations.
Effective incorporation of structural biases improves task performance.
Abstract
Attention networks have proven to be an effective approach for embedding categorical inference within a deep neural network. However, for many tasks we may want to model richer structural dependencies without abandoning end-to-end training. In this work, we experiment with incorporating richer structural distributions, encoded using graphical models, within deep networks. We show that these structured attention networks are simple extensions of the basic attention procedure, and that they allow for extending attention beyond the standard soft-selection approach, such as attending to partial segmentations or to subtrees. We experiment with two different classes of structured attention networks: a linear-chain conditional random field and a graph-based parsing model, and describe how these models can be practically implemented as neural network layers. Experiments show that this approach…
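To make the linear-chain case concrete, here is a minimal sketch of how a structured attention layer of this kind could compute its output. It is an illustration, not the authors' implementation, and rests on these assumptions: each input position i has a binary selection variable z_i, the layer is given log-potentials `unary` and `pairwise` (hypothetical names, normally produced by the network), and the context vector is the sum of inputs weighted by the marginals p(z_i = 1), computed with the standard forward-backward algorithm. The function names are likewise hypothetical.

```python
import math
from itertools import product


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def crf_marginals(unary, pairwise):
    """Return p(z_i = 1) for a binary linear-chain CRF via forward-backward.

    unary[i][s]    : log-potential for z_i = s          (assumed input)
    pairwise[a][b] : log-potential for z_i = a, z_{i+1} = b (assumed input)
    """
    n = len(unary)
    states = range(2)
    # Forward pass: alpha[i][b] = log-sum of scores of prefixes ending in z_i = b.
    alpha = [[0.0, 0.0] for _ in range(n)]
    alpha[0] = list(unary[0])
    for i in range(1, n):
        for b in states:
            alpha[i][b] = unary[i][b] + logsumexp(
                [alpha[i - 1][a] + pairwise[a][b] for a in states]
            )
    # Backward pass: beta[i][a] = log-sum of scores of suffixes after z_i = a.
    beta = [[0.0, 0.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        for a in states:
            beta[i][a] = logsumexp(
                [pairwise[a][b] + unary[i + 1][b] + beta[i + 1][b] for b in states]
            )
    log_z = logsumexp(alpha[n - 1])  # log partition function
    return [math.exp(alpha[i][1] + beta[i][1] - log_z) for i in range(n)]


def structured_attention(inputs, unary, pairwise):
    """Context vector: inputs weighted by the CRF marginals p(z_i = 1)."""
    probs = crf_marginals(unary, pairwise)
    dim = len(inputs[0])
    context = [sum(p * x[d] for p, x in zip(probs, inputs)) for d in range(dim)]
    return context, probs


def brute_force_marginals(unary, pairwise):
    """Enumerate all 2^n assignments; only for checking small examples."""
    n = len(unary)
    weights = {}
    for z in product(range(2), repeat=n):
        score = sum(unary[i][z[i]] for i in range(n))
        score += sum(pairwise[z[i]][z[i + 1]] for i in range(n - 1))
        weights[z] = math.exp(score)
    total = sum(weights.values())
    return [
        sum(w for z, w in weights.items() if z[i] == 1) / total for i in range(n)
    ]
```

With all pairwise log-potentials set to zero the positions decouple and each marginal reduces to an independent sigmoid of the unary score difference. Nonzero pairwise terms let neighbouring positions influence each other, which is what allows attention over contiguous segments rather than isolated positions. In a real network the same recursions would be written with differentiable tensor operations so that gradients flow through the marginals.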
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning
