Confidence through Attention
Matīss Rikters, Mark Fishel

TL;DR
This paper uses attention distributions from neural machine translation models as a confidence metric, applying it to filter a back-translated corpus and to select between the outputs of two translation systems, with gains of up to 2.22 and 0.99 BLEU points respectively.
Contribution
It introduces two novel methods leveraging attention distributions as confidence scores for translation quality assessment and system combination.
Findings
Up to 2.22 BLEU points improvement in filtering tasks
Up to 0.99 BLEU points improvement in hybrid translation
Weak correlation between confidence scores and human judgments
Abstract
Attention distributions of the generated translations are a useful by-product of attention-based recurrent neural network translation models and can be treated as soft alignments between the input and output tokens. In this work, we use attention distributions as a confidence metric for output translations. We present two strategies of using the attention distributions: filtering out bad translations from a large back-translated corpus, and selecting the best translation in a hybrid setup of two different translation systems. While manual evaluation indicated only a weak correlation between our confidence score and human judgments, the use cases showed improvements of up to 2.22 BLEU points for filtering and 0.99 points for hybrid translation, tested on English<->German and English<->Latvian translation.
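The abstract does not spell out how the confidence score is computed, so the following is only a minimal sketch of the general idea, not the authors' formula. It assumes an attention matrix `attn` where `attn[i][j]` is the weight output token `i` places on input token `j`, and it combines two illustrative penalties: the mean entropy of each output token's attention, and the deviation of each input token's total attention from 1. The function names `attention_confidence`, `filter_corpus`, and `pick_translation` are hypothetical, chosen here to mirror the two use cases (corpus filtering and hybrid selection).

```python
import math


def attention_confidence(attn):
    """Score an attention matrix; higher (closer to 0) means more focused.

    attn[i][j] is the weight that output token i places on input token j.
    Illustrative entropy-plus-coverage heuristic (an assumption, not the
    paper's exact metric).
    """
    n_out = len(attn)
    n_in = len(attn[0])

    # Dispersion: mean entropy of each output token's attention row.
    entropy = 0.0
    for row in attn:
        entropy -= sum(w * math.log(w) for w in row if w > 0.0)
    entropy /= n_out

    # Coverage: penalise input tokens whose total attention is far from 1.
    coverage = 0.0
    for j in range(n_in):
        total = sum(row[j] for row in attn)
        coverage += math.log(1.0 + (1.0 - total) ** 2)
    coverage /= n_in

    return -(entropy + coverage)


def filter_corpus(scored_pairs, keep_fraction):
    """Keep the highest-scoring fraction of (source, target, score) triples."""
    ranked = sorted(scored_pairs, key=lambda p: p[2], reverse=True)
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]


def pick_translation(candidates):
    """Hybrid selection: return the (text, score) pair with the highest score."""
    return max(candidates, key=lambda c: c[1])


# Toy attention matrices: one sharply aligned, one fully diffuse.
sharp = [[1.0, 0.0], [0.0, 1.0]]
diffuse = [[0.5, 0.5], [0.5, 0.5]]
```

Under this sketch the sharply aligned matrix scores higher than the diffuse one, so a filtering step would retain it first and a hybrid setup would prefer the system that produced it. The keep fraction and the equal weighting of the two penalties are arbitrary choices made for illustration.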
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
