The Ubiqus English-Inuktitut System for WMT20
François Hernandez, Vincent Nguyen

TL;DR
This paper presents Ubiqus' multilingual Transformer-based system for English-Inuktitut translation at WMT20. It addresses the challenges of a low-resource setting and an agglutinative language through joint training and careful data handling.
Contribution
It introduces a multilingual Transformer approach trained on multiple agglutinative languages specifically for English-Inuktitut translation, tackling low-resource and linguistic challenges.
Findings
Effective multilingual training improves translation quality.
Handling language-specific features enhances system robustness.
Joint training and careful data handling mitigate the low-resource data limitations.
Abstract
This paper describes Ubiqus' submission to the WMT20 English-Inuktitut shared news translation task. Our main system, and only submission, is based on a multilingual approach, jointly training a Transformer model on several agglutinative languages. The English-Inuktitut translation task is challenging at every step, from data selection, preparation and tokenization to quality evaluation down the line. Difficulties emerge both because of the peculiarities of the Inuktitut language and because of the low-resource context.
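The abstract does not say how the languages are combined for joint training. A minimal sketch, assuming the common convention of prepending a target-language token to each source sentence, might look like the following. The token format `<2xx>`, the helper names `tag_source` and `build_joint_corpus`, and the language code `xx` are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def tag_source(src: str, tgt_lang: str) -> str:
    """Prepend a target-language token to the source sentence.

    The "<2xx>" token format is an assumed convention, not the paper's.
    """
    return f"<2{tgt_lang}> {src}"


def build_joint_corpus(corpora: Dict[str, List[Pair]]) -> List[Pair]:
    """Merge per-language parallel corpora into one tagged training set.

    `corpora` maps a target-language code to its (source, target) pairs.
    """
    joint: List[Pair] = []
    for tgt_lang, pairs in corpora.items():
        for src, tgt in pairs:
            joint.append((tag_source(src, tgt_lang), tgt))
    return joint


# Placeholder data: "iu" is the ISO 639-1 code for Inuktitut, "xx" stands in
# for any other language; the target strings are dummies, not real text.
corpus = build_joint_corpus({
    "iu": [("Hello", "TARGET_IU")],
    "xx": [("Hello", "TARGET_XX")],
})
```

A single model trained on `corpus` sees every language pair in one stream, and the tag tells it which target language to produce. In a low-resource setting the smaller corpus is often oversampled before merging, which this sketch leaves out.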
Taxonomy
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Softmax · Residual Connection · Dense Connections · Label Smoothing · Layer Normalization · Adam · Attention Is All You Need
