Continual Learning of Neural Machine Translation within Low Forgetting Risk Regions

Shuhao Gu; Bojie Hu; Yang Feng

arXiv:2211.01542·cs.CL·November 7, 2022

TL;DR

This paper introduces a two-stage training approach for neural machine translation: it first identifies low forgetting risk regions, where parameter updates preserve performance on the previous task, and then trains on the new task within those regions to avoid catastrophic forgetting.

Contribution

The paper proposes a two-stage training method that searches for low forgetting risk regions, using the curvature of the loss and the impact of the parameters on the model, and then trains on the new task inside those regions.

Findings

01 Significant improvements over strong baselines in domain and language adaptation tasks.

02 Effective avoidance of catastrophic forgetting without access to previous training data.

03 Enhanced model retention and adaptation capabilities in continual learning scenarios.

Abstract

This paper considers continual learning of a large-scale pretrained neural machine translation model without accessing the previous training data or introducing model separation. We argue that the widely used regularization-based methods, which perform multi-objective learning with an auxiliary loss, suffer from the misestimate problem and cannot always achieve a good balance between the previous and new tasks. To solve this problem, we propose a two-stage training method based on the local features of the real loss. We first search for low forgetting risk regions, where the model can retain its performance on the previous task as the parameters are updated, to avoid the catastrophic forgetting problem. Then we continually train the model within this region, using only the new training data, to fit the new task. Specifically, we propose two methods to search the low forgetting risk regions,…
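The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract is truncated before it describes the two search methods, so the region used here is an assumption, namely a per-parameter box around the pretrained weights whose radii come from a diagonal quadratic approximation of the old-task loss. The names `search_region`, `project_into_region`, `train_within_region`, the curvature values, and the forgetting budget `eps` are all hypothetical.

```python
import math
from typing import Callable, List


def search_region(curvature: List[float], eps: float) -> List[float]:
    """Stage 1 (assumed form): pick a box radius per parameter.

    Assumes the old-task loss near the pretrained weights is roughly
    0.5 * sum(h_i * delta_i ** 2). Giving each parameter a budget of
    eps / n keeps the approximate loss increase at or below eps.
    """
    n = len(curvature)
    return [math.sqrt(2.0 * eps / (n * h)) for h in curvature]


def project_into_region(theta: List[float], center: List[float],
                        radius: List[float]) -> List[float]:
    """Clip each parameter into [center - radius, center + radius]."""
    return [min(max(t, c - r), c + r)
            for t, c, r in zip(theta, center, radius)]


def train_within_region(theta_old: List[float], radius: List[float],
                        grad_new: Callable[[List[float]], List[float]],
                        lr: float = 0.1, steps: int = 100) -> List[float]:
    """Stage 2: gradient descent on the new task only, projected back
    into the box after every step. No old-task data is used here."""
    theta = list(theta_old)
    for _ in range(steps):
        grad = grad_new(theta)
        theta = [t - lr * g for t, g in zip(theta, grad)]
        theta = project_into_region(theta, theta_old, radius)
    return theta


# Toy problem: two parameters, pretrained at the old-task optimum (0, 0).
theta_old = [0.0, 0.0]
curvature = [100.0, 0.5]   # hypothetical: parameter 0 matters much more for the old task
eps = 0.5                  # hypothetical forgetting budget
radius = search_region(curvature, eps)

target = [1.0, 1.0]        # optimum of the toy new-task loss 0.5 * ||theta - target||^2


def grad_new(theta: List[float]) -> List[float]:
    return [t - tg for t, tg in zip(theta, target)]


theta_new = train_within_region(theta_old, radius, grad_new)
old_loss_increase = 0.5 * sum(h * (t - c) ** 2
                              for h, t, c in zip(curvature, theta_new, theta_old))
print(radius, theta_new, old_loss_increase)
```

In this toy setting the high-curvature parameter is held at the edge of its small radius, while the low-curvature parameter moves almost all the way to the new-task optimum. The contrast with regularization-based methods is that the old task enters as a hard constraint on where the parameters may go, not as an auxiliary loss term that has to be weighted against the new-task loss.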

Code & Models

Repositories

ictnlp/lfr-nmt (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Cancer-related molecular mechanisms research