Hyperparameter-free Continuous Learning for Domain Classification in Natural Language Understanding

Ting Hua; Yilin Shen; Changsheng Zhao; Yen-Chang Hsu; Hongxia Jin

arXiv:2201.01420 · cs.CL · January 6, 2022

TL;DR

This paper introduces a hyperparameter-free continual learning model for domain classification in natural language understanding (NLU). The model keeps accuracy high and performance stable without retraining on all old data, and it outperforms existing methods.

Contribution

The paper presents a hyperparameter-free continual learning approach that uses Fisher information and dynamical weight consolidation to achieve stable, efficient domain classification in natural language understanding (NLU).
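As background for the Fisher information term mentioned above, here is a minimal sketch of how a diagonal Fisher estimate is commonly used to weight a consolidation penalty, in the style of elastic weight consolidation. This is not the paper's method: the logistic-regression model, the function names, and the `strength` parameter are all assumptions made for illustration. In particular, `strength` is exactly the kind of extra hyperparameter the paper sets out to avoid.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def diagonal_fisher(weights, samples):
    """Empirical diagonal Fisher information for a logistic-regression model.

    samples is a list of (features, label) pairs with label in {0, 1}.
    Hypothetical helper, not taken from the paper.
    """
    fisher = [0.0] * len(weights)
    for features, label in samples:
        p = sigmoid(sum(w * x for w, x in zip(weights, features)))
        # Gradient of log p(label | features) with respect to each weight.
        grad = [(label - p) * x for x in features]
        for i, g in enumerate(grad):
            fisher[i] += g * g
    # Average the squared gradients over the samples.
    return [f / len(samples) for f in fisher]


def consolidation_penalty(weights, old_weights, fisher, strength=1.0):
    """Quadratic penalty that discourages moving weights with high Fisher values.

    strength is an illustrative assumption (a tunable weight on the penalty).
    """
    return 0.5 * strength * sum(
        f * (w - w0) ** 2 for f, w, w0 in zip(fisher, weights, old_weights)
    )


# Toy "old domain" data: only the first feature is ever active.
old_weights = [0.0, 0.0]
old_samples = [([1.0, 0.0], 1), ([1.0, 0.0], 0)]
fisher = diagonal_fisher(old_weights, old_samples)

# Moving the second weight costs nothing because its Fisher value is zero.
penalty = consolidation_penalty([1.0, 5.0], old_weights, fisher)
```

In this toy setup the first weight carries all the Fisher mass, so the penalty only resists changes to that weight. A weight that the old data never exercised is free to adapt to a new domain.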

Findings

01. Outperforms state-of-the-art methods by up to 20% in accuracy.

02. Maintains stable performance across various environments.

03. Makes effective use of old data without introducing extra hyperparameters.

Abstract

Domain classification is a fundamental task in natural language understanding (NLU), and it often requires fast accommodation of newly emerging domains. This constraint makes it impossible to retrain on all previous domains, even if they are accessible to the new model. Most existing continual learning approaches suffer from low accuracy and performance fluctuation, especially when the distributions of old and new data differ significantly. In fact, the key real-world problem is not the absence of old data but the inefficiency of retraining the model on the whole old dataset. Is it possible to use some old data to yield high accuracy and maintain stable performance without introducing extra hyperparameters? In this paper, we propose a hyperparameter-free continual learning model for text data that can stably produce high performance under various…
