Scheduled Multi-task Learning for Neural Chat Translation

Yunlong Liang; Fandong Meng; Jinan Xu; Yufeng Chen; Jie Zhou

arXiv:2205.03766·cs.CL·May 11, 2022

TL;DR

This paper introduces a three-stage scheduled multi-task learning framework for neural chat translation, effectively leveraging large-scale in-domain data and auxiliary tasks to improve translation quality across multiple language pairs.

Contribution

It proposes a novel three-stage training approach with strategic scheduling of auxiliary tasks, enhancing chat translation performance and addressing data scarcity issues.

Findings

01 Outperforms existing methods in four language directions

02 Effectively incorporates large-scale in-domain data

03 Demonstrates the importance of task scheduling in training

Abstract

Neural Chat Translation (NCT) aims to translate conversational text into different languages. Existing methods mainly focus on modeling bilingual dialogue characteristics (e.g., coherence) to improve chat translation via multi-task learning on small-scale chat translation data. Although NCT models have achieved impressive success, their performance is still far from satisfactory due to insufficient chat translation data and simple joint training schemes. To address these issues, we propose a scheduled multi-task learning framework for NCT. Specifically, we devise a three-stage training framework that incorporates large-scale in-domain chat translation data into training by adding a second pre-training stage between the original pre-training and fine-tuning stages. Further, we investigate where and how to schedule the dialogue-related auxiliary tasks in multiple training stages to…
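The three-stage idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Stage`, `build_schedule`, and `total_loss`, the data descriptions for the first and last stages, and the simple weighted-sum loss are all assumptions made for illustration. Because the abstract is truncated before it says which stages receive the auxiliary tasks, the sketch leaves that choice as a parameter rather than guessing it.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Stage:
    name: str            # hypothetical stage identifier
    data: str            # informal description of the training data (assumed)
    use_aux_tasks: bool  # whether dialogue-related auxiliary losses are enabled


def build_schedule(aux_stages: Iterable[str]) -> List[Stage]:
    """Return the three stages in order, enabling auxiliary tasks only
    in the stages named in `aux_stages` (a free choice in this sketch)."""
    enabled = set(aux_stages)
    stages = [
        ("pretrain", "original pre-training data (assumed general-domain)"),
        ("in_domain_pretrain", "large-scale in-domain chat translation data"),
        ("finetune", "small-scale chat translation data (assumed)"),
    ]
    return [Stage(name, data, name in enabled) for name, data in stages]


def total_loss(
    translation_loss: float,
    aux_losses: Dict[str, float],
    stage: Stage,
    aux_weight: float = 1.0,
) -> float:
    """Combine the main translation loss with auxiliary losses.

    The weighted sum is an assumed form of joint objective; auxiliary
    losses are ignored entirely in stages where they are not scheduled.
    """
    if not stage.use_aux_tasks:
        return translation_loss
    return translation_loss + aux_weight * sum(aux_losses.values())


# Example: schedule auxiliary tasks in the last two stages (an arbitrary choice).
schedule = build_schedule({"in_domain_pretrain", "finetune"})
for stage in schedule:
    loss = total_loss(2.0, {"aux_a": 0.5, "aux_b": 0.25}, stage, aux_weight=0.5)
    print(stage.name, stage.use_aux_tasks, loss)
```

Passing a different set to `build_schedule` changes where the auxiliary tasks are active, which is the kind of "where and how to schedule" question the abstract says the paper investigates.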

Code & Models

Repositories

xl2248/sml · tf · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis