# Attention Strategies for Multi-Source Sequence-to-Sequence Learning

**Authors:** Jindřich Libovický, Jindřich Helcl

arXiv: 1704.06567 · 2017-04-24

## TL;DR

This paper introduces two attention combination methods, flat and hierarchical, for multi-source sequence-to-sequence models and reports competitive results on the WMT16 Multimodal Translation and Automatic Post-editing tasks.

## Contribution

The paper proposes flat and hierarchical strategies for combining the outputs of attention mechanisms over multiple source sequences, an area the authors describe as relatively unexplored.
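This summary does not reproduce the paper's equations, so the following is only a minimal sketch of one plausible reading of the two strategy names. In the flat variant, a single softmax is taken over the states of all sources jointly; in the hierarchical variant, each source is attended to separately and a second attention step then weighs the per-source context vectors. Dot-product scoring, the shared state dimensionality across sources, and all function and variable names are assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    """Sum of vectors scaled by their weights."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def flat_combination(query, sources):
    """Flat (assumed reading): one softmax over the states of ALL sources."""
    all_states = [h for states in sources for h in states]
    weights = softmax([dot(query, h) for h in all_states])
    return weighted_sum(weights, all_states), weights


def hierarchical_combination(query, sources):
    """Hierarchical (assumed reading): attend within each source,
    then attend over the resulting per-source context vectors."""
    contexts = []
    for states in sources:
        w = softmax([dot(query, h) for h in states])
        contexts.append(weighted_sum(w, states))
    source_weights = softmax([dot(query, c) for c in contexts])
    return weighted_sum(source_weights, contexts), source_weights


# Toy data (hypothetical): a decoder state and two sources of different lengths.
query = [1.0, 0.0]
text_states = [[1.0, 0.0], [0.0, 1.0]]
image_states = [[0.5, 0.5], [0.2, 0.8], [0.9, 0.1]]

flat_ctx, flat_w = flat_combination(query, [text_states, image_states])
hier_ctx, hier_w = hierarchical_combination(query, [text_states, image_states])
```

In this sketch the flat variant returns one weight per encoder state (five here), while the hierarchical variant returns one weight per source (two here), which makes the relative importance of each source directly readable. A trained model would typically use learned scoring functions and projections rather than raw dot products.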

## Key findings

- Both methods achieve competitive results on the WMT16 Multimodal Translation and Automatic Post-editing tasks.
- The hierarchical approach outperforms some existing techniques.
- A systematic evaluation against existing techniques supports the effectiveness of the proposed methods.

## Abstract

Modeling attention in neural multi-source sequence-to-sequence learning remains a relatively unexplored area, despite its usefulness in tasks that incorporate multiple source languages or modalities. We propose two novel approaches to combine the outputs of attention mechanisms over each source sequence, flat and hierarchical. We compare the proposed methods with existing techniques and present results of systematic evaluation of those methods on the WMT16 Multimodal Translation and Automatic Post-editing tasks. We show that the proposed methods achieve competitive results on both tasks.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1704.06567/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1704.06567/full.md

## References

27 references, with the full list in the complete paper: https://tomesphere.com/paper/1704.06567/full.md

---
Source: https://tomesphere.com/paper/1704.06567