Adam Mickiewicz University at WMT 2022: NER-Assisted and Quality-Aware Neural Machine Translation

Artur Nowakowski, Gabriela Pałka, Kamil Guttmann, and Mikołaj Pokrywka

arXiv:2209.02962 · cs.CL · September 8, 2022 · 5 cites

Open Access

TL;DR

This paper describes AMU's constrained-track WMT 2022 submissions for Ukrainian $\leftrightarrow$ Czech translation. The systems combine a model ensemble with source factors, back-translation, document-level modeling, and quality-aware reranking, and they achieved the top automatic evaluation scores.

Contribution

Introduces a multi-model ensemble that combines source factors, document-level training, and quality-aware reranking to improve translation quality in the constrained setting.

Findings

01

Ranked first in both translation directions by automatic metrics.

02

Effective use of source factors for named entity handling.

03

Improved translation quality through quality-aware reranking.
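
To make the idea of source factors for named entities more concrete, here is a minimal sketch in Python. It assumes entity spans are already available from some NER tool and attaches one factor per token. The `token|FACTOR` text format, the label set (`PER`, `LOC`, `O`), and the function name `attach_ner_factors` are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Tuple

# (start_index, end_index_exclusive, label) over the token list.
EntitySpan = Tuple[int, int, str]


def attach_ner_factors(tokens: List[str], spans: List[EntitySpan]) -> List[str]:
    """Pair every source token with a named-entity factor.

    Tokens outside any entity span get the factor "O". The "token|FACTOR"
    output format is an assumption made for illustration only.
    """
    factors = ["O"] * len(tokens)
    for start, end, label in spans:
        for i in range(start, end):
            factors[i] = label
    return [f"{tok}|{fac}" for tok, fac in zip(tokens, factors)]


if __name__ == "__main__":
    tokens = ["Anna", "Nowak", "lives", "in", "Prague"]
    spans = [(0, 2, "PER"), (4, 5, "LOC")]
    print(" ".join(attach_ner_factors(tokens, spans)))
    # Anna|PER Nowak|PER lives|O in|O Prague|LOC
```

In a factored model, each factor would typically get its own embedding that is combined with the token embedding, so the encoder sees which tokens belong to a named entity.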

Abstract

This paper presents Adam Mickiewicz University's (AMU) submissions to the constrained track of the WMT 2022 General MT Task. We participated in the Ukrainian $\leftrightarrow$ Czech translation directions. The systems are a weighted ensemble of four models based on the Transformer (big) architecture. The models use source factors to utilize the information about named entities present in the input. Each of the models in the ensemble was trained using only the data provided by the shared task organizers. A noisy back-translation technique was used to augment the training corpora. One of the models in the ensemble is a document-level model, trained on parallel and synthetic longer sequences. During the sentence-level decoding process, the ensemble generated the n-best list. The n-best list was merged with the n-best list generated by a single document-level model which translated multiple…
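
The n-best merging step described in the abstract, followed by quality-aware reranking, could look roughly like the sketch below. Everything here is a simplified assumption: hypotheses are `(text, model_score)` pairs, duplicates keep their best model score, and the final score adds a weighted quality estimate to the model score. The `toy_quality` function is a hypothetical stand-in for a learned quality estimation model, and the additive combination is not claimed to be the paper's formula.

```python
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[str, float]  # (translation, model log-probability)


def merge_nbest(first: List[Hypothesis], second: List[Hypothesis]) -> List[Hypothesis]:
    """Merge two n-best lists, keeping the best model score per unique text."""
    best: Dict[str, float] = {}
    for text, score in first + second:
        if text not in best or score > best[text]:
            best[text] = score
    return list(best.items())


def rerank(
    source: str,
    hypotheses: List[Hypothesis],
    quality_fn: Callable[[str, str], float],
    quality_weight: float = 1.0,
) -> List[Hypothesis]:
    """Sort hypotheses by model score plus a weighted quality estimate."""
    rescored = [
        (text, score + quality_weight * quality_fn(source, text))
        for text, score in hypotheses
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


def toy_quality(source: str, hypothesis: str) -> float:
    """Hypothetical scorer: penalize length mismatch between source and output."""
    return -abs(len(source.split()) - len(hypothesis.split()))


if __name__ == "__main__":
    sentence_level = [("a b c", -1.0), ("a b", -0.8)]
    document_level = [("a b c", -0.9), ("a b c d e", -0.7)]
    merged = merge_nbest(sentence_level, document_level)
    for text, score in rerank("x y z", merged, toy_quality):
        print(f"{score:.1f}\t{text}")
```

Here the hypothesis with the highest raw model score is not the one that ends up on top, because the quality term reorders the merged list. That reordering is the point of reranking.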

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Absolute Position Encodings · Softmax · Residual Connection · Position-Wise Feed-Forward Layer · Dropout · Dense Connections