The JHU-Microsoft Submission for WMT21 Quality Estimation Shared Task
Shuoyang Ding, Marcin Junczys-Dowmunt, Matt Post, Christian Federmann, Philipp Koehn

TL;DR
This paper describes the JHU-Microsoft joint submission to the WMT21 quality estimation shared task, which focuses on target-side word-level post-editing effort estimation (Task 2). The system combines Levenshtein Transformer training with data augmentation and ranks first on the MT MCC metric for English-German.
Contribution
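The paper combines Levenshtein Transformer training with data augmentation (forward, backward, and round-trip translation, plus pseudo post-editing of the MT output) for word-level quality estimation. The resulting system is competitive with the OpenKiwi-XLM baseline and ranks first on the MT MCC metric for English-German.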
Findings
Our system is competitive with the OpenKiwi-XLM baseline.
It is the top-ranked system on the MCC metric for English-German.
Data augmentation improves quality estimation performance.
Abstract
This paper presents the JHU-Microsoft joint submission for the WMT 2021 quality estimation shared task. We only participate in Task 2 (post-editing effort estimation) of the shared task, focusing on the target-side word-level quality estimation. The techniques we experimented with include Levenshtein Transformer training and data augmentation with a combination of forward, backward, round-trip translation, and pseudo post-editing of the MT output. We demonstrate the competitiveness of our system compared to the widely adopted OpenKiwi-XLM baseline. Our system is also the top-ranking system on the MT MCC metric for the English-German language pair.
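The MCC metric mentioned in the abstract is the Matthews correlation coefficient, computed here over binary word-level tags. The following is a minimal sketch, not the shared task's official scorer: it assumes each target token carries an OK or BAD tag, and the function name `mcc` and the choice of BAD as the positive class are illustrative assumptions.

```python
import math


def mcc(gold, pred, positive="BAD"):
    """Matthews correlation coefficient for two equal-length tag sequences.

    Assumption: tags are binary (e.g. "OK" / "BAD"); `positive` names the
    class treated as positive. This is an illustrative helper, not the
    official WMT evaluation script.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = fp = tn = fn = 0
    for g, p in zip(gold, pred):
        if p == positive and g == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif g == positive:
            fn += 1
        else:
            tn += 1

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical tags for a four-token MT output.
gold_tags = ["OK", "BAD", "OK", "BAD"]
pred_tags = ["OK", "OK", "OK", "BAD"]
score = mcc(gold_tags, pred_tags)
```

MCC ranges from -1 (every tag inverted) to 1 (perfect agreement), with 0 corresponding to chance-level prediction. Because it uses all four cells of the confusion matrix, it stays informative when OK tags heavily outnumber BAD tags, which is the usual situation in word-level quality estimation data.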
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Layer Normalization · Dense Connections · Label Smoothing · Byte Pair Encoding · Softmax
