Scheduled Multi-task Learning for Neural Chat Translation
Yunlong Liang, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou

TL;DR
This paper introduces a three-stage scheduled multi-task learning framework for neural chat translation, effectively leveraging large-scale in-domain data and auxiliary tasks to improve translation quality across multiple language pairs.
Contribution
It proposes a novel three-stage training approach with strategic scheduling of auxiliary tasks, enhancing chat translation performance and addressing data scarcity issues.
Findings
Outperforms existing methods in four language directions
Effectively incorporates large-scale in-domain data
Demonstrates the importance of task scheduling in training
Abstract
Neural Chat Translation (NCT) aims to translate conversational text into different languages. Existing methods mainly focus on modeling bilingual dialogue characteristics (e.g., coherence) to improve chat translation via multi-task learning on small-scale chat translation data. Although NCT models have achieved impressive success, they are still far from satisfactory due to insufficient chat translation data and simple joint training schemes. To address these issues, we propose a scheduled multi-task learning framework for NCT. Specifically, we devise a three-stage training framework that incorporates large-scale in-domain chat translation data into training by adding a second pre-training stage between the original pre-training and fine-tuning stages. Further, we investigate where and how to schedule the dialogue-related auxiliary tasks in multiple training stages to…
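The three-stage idea in the abstract can be pictured as a small schedule table: each stage names its data source and the auxiliary tasks that are switched on. The sketch below is only an illustration under stated assumptions, not the authors' implementation. The dataset labels, the `coherence` task name, the choice of stages in which it is active, and the `aux_weight` values are all hypothetical, since the abstract leaves "where and how" to schedule auxiliary tasks as the question under study.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One training stage in a hypothetical scheduled multi-task setup."""
    name: str
    dataset: str             # hypothetical dataset label, not from the paper
    aux_tasks: tuple = ()    # auxiliary tasks active in this stage (assumed)
    aux_weight: float = 1.0  # assumed weight applied to the auxiliary losses


def stage_loss(stage, translation_loss, aux_losses):
    """Translation loss plus the weighted auxiliary losses active in this stage."""
    active = sum(aux_losses[task] for task in stage.aux_tasks)
    return translation_loss + stage.aux_weight * active


# Assumed schedule: no auxiliary task in the original pre-training stage,
# then an auxiliary task in the second pre-training and fine-tuning stages.
SCHEDULE = [
    Stage("pretrain", "general_parallel_data"),
    Stage("second_pretrain", "large_in_domain_chat_data", aux_tasks=("coherence",)),
    Stage("finetune", "small_chat_translation_data",
          aux_tasks=("coherence",), aux_weight=0.5),
]

if __name__ == "__main__":
    # Placeholder loss values, used only to show how the schedule changes the objective.
    for stage in SCHEDULE:
        total = stage_loss(stage, translation_loss=2.0, aux_losses={"coherence": 1.0})
        print(stage.name, stage.dataset, total)
```

Keeping the schedule as data rather than hard-coding it in the training loop makes it straightforward to compare different placements of the auxiliary tasks, which is the kind of comparison the abstract describes.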
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
