TSMind: Alibaba and Soochow University's Submission to the WMT22 Translation Suggestion Task

Xin Ge, Ke Wang, Jiayi Wang, Nini Xiao, Xiangyu Duan, Yu Zhao, Yuqi Zhang

arXiv:2211.08987 · cs.CL · November 17, 2022 · 1 citation

TL;DR

This paper presents TSMind, a translation suggestion system from Alibaba and Soochow University that fine-tunes large pre-trained translation models and applies filtered data augmentation, ranking first in three of four language directions in the WMT22 shared task.

Contribution

It introduces a novel data filtering approach using dual conditional cross-entropy and GPT-2 models to enhance data augmentation for translation suggestion.

Findings

01 Ranked first in three out of four language directions

02 Effective use of data filtering improves translation suggestion quality

03 Demonstrates success of fine-tuning large pre-trained models for TS tasks

Abstract

This paper describes the joint submission of Alibaba and Soochow University, TSMind, to the WMT 2022 Shared Task on Translation Suggestion (TS). We participate in the English-German and English-Chinese tasks. Our approach fine-tunes large-scale pre-trained models on the downstream task, a paradigm that has recently achieved great success. We choose FAIR's WMT19 English-German news translation system and MBART50 for English-Chinese as our pre-trained models. Because the task restricts the training data that may be used, we follow the data augmentation strategies proposed by WeTS to boost our TS model performance. The difference is that we additionally use a dual conditional cross-entropy model and a GPT-2 language model to filter the augmented data. The leaderboard finally shows that our submissions are ranked first in three of four language directions in the…
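The filtering step named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used dual conditional cross-entropy formulation, in which a sentence pair is penalized both for disagreement between the forward and backward translation models and for high average cross-entropy, and it treats the GPT-2 signal as a simple perplexity cap. The function names, the thresholds `min_dual` and `max_ppl`, and the way the two scores are combined are hypothetical; this page does not state the paper's actual settings. Inputs are per-token log-probabilities that would come from the translation models and the language model.

```python
import math
from typing import Sequence


def mean_cross_entropy(token_logprobs: Sequence[float]) -> float:
    """Length-normalized cross-entropy: negative mean token log-probability."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return -sum(token_logprobs) / len(token_logprobs)


def dual_xent_score(fwd_logprobs: Sequence[float],
                    bwd_logprobs: Sequence[float]) -> float:
    """Dual conditional cross-entropy score in (0, 1]; higher is better.

    fwd_logprobs: log P(target tokens | source) from a forward model.
    bwd_logprobs: log P(source tokens | target) from a backward model.
    """
    h_fwd = mean_cross_entropy(fwd_logprobs)
    h_bwd = mean_cross_entropy(bwd_logprobs)
    # Penalize disagreement between the two directions and high average entropy.
    penalty = abs(h_fwd - h_bwd) + 0.5 * (h_fwd + h_bwd)
    return math.exp(-penalty)


def lm_perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity of a sentence under a language model (e.g. GPT-2)."""
    return math.exp(mean_cross_entropy(token_logprobs))


def keep_pair(fwd_logprobs: Sequence[float],
              bwd_logprobs: Sequence[float],
              lm_logprobs: Sequence[float],
              min_dual: float = 0.1,    # hypothetical threshold
              max_ppl: float = 200.0    # hypothetical threshold
              ) -> bool:
    """Keep an augmented pair only if both filters accept it (assumed rule)."""
    return (dual_xent_score(fwd_logprobs, bwd_logprobs) >= min_dual
            and lm_perplexity(lm_logprobs) <= max_ppl)
```

A pair on which the two translation directions agree and which is fluent under the language model passes; a pair that one direction scores confidently and the other scores poorly is dropped, even when its average cross-entropy is moderate.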

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis

Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Linear Layer · Dropout · Byte Pair Encoding · Attention Dropout · Linear Warmup With Cosine Annealing · Dense Connections · Layer Normalization