# Self-Attentional Models for Lattice Inputs

**Authors:** Matthias Sperber, Graham Neubig, Ngoc-Quan Pham, Alex Waibel

arXiv: 1906.01617 · 2019-06-05

## TL;DR

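This paper extends self-attention to lattice inputs, using probabilistic reachability masks to encode lattice structure and adapting positional embeddings to lattices. On a speech translation task, the model outperforms all examined baselines and is much faster than previous neural lattice models in both training and inference.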

## Contribution

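The paper extends self-attention models to lattice inputs in two ways: probabilistic reachability masks, which build the lattice structure into the model and support lattice scores when they are available, and a method for adapting positional embeddings to lattice structures. This avoids the very slow computation of earlier recurrent lattice models.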

## Key findings

- Outperforms all examined baselines on a speech translation task
- Much faster than previous neural lattice models during both training and inference
- Incorporates lattice structure, and lattice scores if available, through probabilistic reachability masks

## Abstract

Lattices are an efficient and effective method to encode ambiguity of upstream systems in natural language processing tasks, for example to compactly capture multiple speech recognition hypotheses, or to represent multiple linguistic analyses. Previous work has extended recurrent neural networks to model lattice inputs and achieved improvements in various tasks, but these models suffer from very slow computation speeds. This paper extends the recently proposed paradigm of self-attention to handle lattice inputs. Self-attention is a sequence modeling technique that relates inputs to one another by computing pairwise similarities and has gained popularity for both its strong results and its computational efficiency. To extend such models to handle lattices, we introduce probabilistic reachability masks that incorporate lattice structure into the model and support lattice scores if available. We also propose a method for adapting positional embeddings to lattice structures. We apply the proposed model to a speech translation task and find that it outperforms all examined baselines while being much faster to compute than previous neural lattice models during both training and inference.
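To make the masking idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a binary simplification of the reachability mask: a node may attend only to nodes that lie on a common path through the lattice. Lattice scores are ignored here, whereas the probabilistic variant described above would additionally take them into account. The function names, the toy lattice, and the use of unprojected inputs as queries, keys, and values are all illustrative assumptions.

```python
import numpy as np


def reachability(num_nodes, edges):
    """Return boolean R where R[i, j] is True if j is reachable from i (or i == j)."""
    reach = np.eye(num_nodes, dtype=bool)
    for src, dst in edges:
        reach[src, dst] = True
    # Vectorized Warshall transitive closure over the directed lattice edges.
    for k in range(num_nodes):
        reach |= reach[:, k][:, None] & reach[k, :][None, :]
    return reach


def lattice_mask(num_nodes, edges):
    """Additive mask: 0 where two nodes share a lattice path, -inf otherwise."""
    fwd = reachability(num_nodes, edges)
    same_path = fwd | fwd.T  # reachable in either direction
    return np.where(same_path, 0.0, -np.inf)


def masked_self_attention(x, mask):
    """Single-head scaled dot-product self-attention with an additive mask.

    Assumption for brevity: queries, keys, and values are all x itself
    (no learned projections).
    """
    scores = x @ x.T / np.sqrt(x.shape[-1]) + mask
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ x, weights


# Toy lattice (hypothetical): node 0 -> {1, 2} -> node 3.
# Nodes 1 and 2 are competing hypotheses and never share a path.
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
mask = lattice_mask(4, edges)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))  # one 8-dimensional vector per lattice node
output, weights = masked_self_attention(x, mask)
```

In this toy lattice, nodes 1 and 2 receive zero attention weight from each other, while the start and end nodes attend to every node. Because each node always attends to itself, every row of the mask has at least one finite entry and the softmax stays well defined.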

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.01617/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/1906.01617/full.md

## References

48 references, with the full list in the complete paper: https://tomesphere.com/paper/1906.01617/full.md

---
Source: https://tomesphere.com/paper/1906.01617