Two-Headed Monster And Crossed Co-Attention Networks
Yaoyiran Li, Jing Jiang

TL;DR
This paper introduces the Two-Headed Monster (THM) co-attention paradigm and its Transformer-based implementation, Crossed Co-Attention Networks (CCNs). THM connects two symmetric encoder modules to a single decoder through co-attention, and CCNs improve BLEU over the Transformer baseline on WMT 2014 EN-DE and WMT 2016 EN-FI.
Contribution
The paper proposes a novel co-attention paradigm, Two-Headed Monster (THM), and implements it as Crossed Co-Attention Networks (CCNs) built on the Transformer, reporting BLEU gains over the Transformer baseline on two translation tasks.
Findings
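CCNs outperform the Transformer baseline by up to 0.74 BLEU points (base model, WMT 2014 EN-DE).
The THM paradigm pairs two symmetric encoder modules with one decoder module, connected through co-attention.
Gains hold for both model sizes on both tasks: 0.74 (base) and 0.51 (big) BLEU on WMT 2014 EN-DE, and 0.47 (base) and 0.17 (big) BLEU on WMT 2016 EN-FI.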
Abstract
This paper presents some preliminary investigations of a new co-attention mechanism in neural transduction models. We propose a paradigm, termed Two-Headed Monster (THM), which consists of two symmetric encoder modules and one decoder module connected with co-attention. As a specific and concrete implementation of THM, Crossed Co-Attention Networks (CCNs) are designed based on the Transformer model. We demonstrate CCNs on WMT 2014 EN-DE and WMT 2016 EN-FI translation tasks and our model outperforms the strong Transformer baseline by 0.51 (big) and 0.74 (base) BLEU points on EN-DE and by 0.17 (big) and 0.47 (base) BLEU points on EN-FI.
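The abstract does not spell out how the two encoder branches exchange information, so the following is only a minimal sketch of one plausible reading of "crossed" co-attention: each branch uses its own states as queries and attends over the other branch's states as keys and values. The function names (`attention`, `crossed_co_attention`) and the toy inputs are hypothetical, and the sketch omits learned projections, multiple heads, residual connections, and the decoder, so it should not be taken as the authors' implementation.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # Plain scaled dot-product attention over lists of vectors.
    d = len(keys[0])
    dim_out = len(values[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_out)])
    return out

def crossed_co_attention(left, right):
    # ASSUMPTION: the "crossing" means each branch queries the other branch.
    new_left = attention(left, right, right)   # left queries, right keys/values
    new_right = attention(right, left, left)   # right queries, left keys/values
    return new_left, new_right

# Toy states for the two encoder branches (hypothetical values).
left = [[1.0, 0.0], [0.0, 1.0]]
right = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]
new_left, new_right = crossed_co_attention(left, right)
```

Each output keeps the sequence length of the branch that supplied the queries, and every output vector is a weighted average of the other branch's states. A full model would repeat this per layer with learned query, key, and value projections.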
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax
