Continual Learning with Dirichlet Generative-based Rehearsal
Min Zeng, Wei Xue, Qifeng Liu, Yike Guo

TL;DR
This paper introduces Dirichlet Continual Learning (DCL), a novel generative rehearsal method using Dirichlet distributions to better model previous tasks, combined with Jensen-Shannon Knowledge Distillation to improve pseudo sample quality in dialogue systems.
Contribution
The paper proposes DCL, a new Dirichlet-based generative rehearsal strategy, and JSKD, a robust knowledge distillation method, to enhance continual learning in task-oriented dialogue systems.
Findings
DCL outperforms state-of-the-art methods in intent detection.
DCL effectively captures sentence-level features of previous tasks.
Jensen-Shannon Knowledge Distillation improves pseudo sample quality.
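As a rough illustration of the divergence that Jensen-Shannon Knowledge Distillation is named after, the sketch below computes the standard Jensen-Shannon divergence between a teacher and a student output distribution. It is not the paper's loss: the summary above does not say which distributions are compared or how the term is weighted, so the function names, logits, and setup here are assumptions made for illustration only.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    # KL(p || q) in nats; terms with p_i == 0 contribute nothing.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    # JS(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m the midpoint.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical teacher and student logits over three classes.
teacher = softmax([2.0, 0.5, -1.0])
student = softmax([1.5, 0.7, -0.5])
loss = js_divergence(teacher, student)
```

Unlike the KL divergence, this quantity is symmetric in its two arguments and is bounded above by ln 2, which is one common reason to prefer it when one of the two distributions may be unreliable.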
Abstract
Despite recent advancements, data-driven task-oriented dialogue systems (ToDs) struggle with incremental learning due to computational and time constraints. Continual Learning (CL) attempts to solve this by avoiding intensive pre-training, but it faces the problem of catastrophic forgetting (CF). While generative-based rehearsal CL methods have made significant strides, generating pseudo samples that accurately reflect the underlying task-specific distribution is still a challenge. In this paper, we present Dirichlet Continual Learning (DCL), a novel generative-based rehearsal strategy for CL. Unlike the traditionally used Gaussian latent variable in the Conditional Variational Autoencoder (CVAE), DCL leverages the flexibility and versatility of the Dirichlet distribution to model the latent prior variable. This enables it to efficiently capture sentence-level features of…
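To make the contrast between a Gaussian and a Dirichlet latent variable concrete, the sketch below draws one sample of each using only the Python standard library. It shows only how the two latent spaces differ in support. It does not reproduce the paper's CVAE, its reparameterization, or its training objective, none of which are described in this summary. The function names, dimensions, and concentration parameters are hypothetical.

```python
import random

def sample_gaussian_latent(mean, std, rng):
    # Each coordinate is an unconstrained real number.
    return [rng.gauss(m, s) for m, s in zip(mean, std)]

def sample_dirichlet_latent(alpha, rng):
    # Standard construction: independent Gamma(alpha_i, 1) draws,
    # normalised so the result lies on the probability simplex.
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical 4-dimensional latents.
z_gauss = sample_gaussian_latent([0.0] * 4, [1.0] * 4, rng)
z_dir = sample_dirichlet_latent([0.5, 0.5, 2.0, 2.0], rng)
```

A Dirichlet sample is non-negative and sums to one, so it can be read as a mixture over latent components, whereas a Gaussian sample has no such constraint. Concentration values below 1 push mass toward sparse vectors and values above 1 toward more even ones.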
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Speech Recognition and Synthesis
Methods: Knowledge Distillation
