Continual Learning with Dirichlet Generative-based Rehearsal

Min Zeng; Wei Xue; Qifeng Liu; Yike Guo

arXiv:2309.06917 · cs.CL · September 14, 2023 · 1 citation

TL;DR

This paper introduces Dirichlet Continual Learning (DCL), a novel generative rehearsal method using Dirichlet distributions to better model previous tasks, combined with Jensen-Shannon Knowledge Distillation to improve pseudo sample quality in dialogue systems.

Contribution

The paper proposes DCL, a new Dirichlet-based generative rehearsal strategy, and JSKD, a robust knowledge distillation method, to enhance continual learning in task-oriented dialogue systems.

Findings

01. DCL outperforms state-of-the-art methods in intent detection.

02. DCL effectively captures sentence-level features of previous tasks.

03. Jensen-Shannon Knowledge Distillation improves pseudo sample quality.
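The divergence behind the Jensen-Shannon Knowledge Distillation named above can be illustrated with a minimal sketch. This is not the paper's loss: the summary does not give the JSKD formulation, so the code only computes the standard Jensen-Shannon divergence between two discrete distributions. The names `kl_divergence`, `js_divergence`, `teacher`, and `student`, and the probability values, are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats; terms where p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical output distributions over three classes (assumed values).
teacher = [0.7, 0.2, 0.1]
student = [0.5, 0.3, 0.2]

print(js_divergence(teacher, student))
```

Unlike the KL divergence, this quantity is symmetric in its two arguments and bounded above by ln 2, which is one plausible reason to prefer it as a distillation signal; whether that is the authors' motivation is not stated here.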

Abstract

Recent data-driven task-oriented dialogue systems (ToDs) struggle with incremental learning because of computational constraints and time costs. Continual Learning (CL) attempts to solve this by avoiding intensive pre-training, but it faces the problem of catastrophic forgetting (CF). While generative-based rehearsal CL methods have made significant strides, generating pseudo samples that accurately reflect the underlying task-specific distribution is still a challenge. In this paper, we present Dirichlet Continual Learning (DCL), a novel generative-based rehearsal strategy for CL. Unlike the traditionally used Gaussian latent variable in the Conditional Variational Autoencoder (CVAE), DCL leverages the flexibility and versatility of the Dirichlet distribution to model the latent prior variable. This enables it to efficiently capture sentence-level features of…
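The contrast the abstract draws between a Gaussian and a Dirichlet latent variable can be made concrete with a small sketch. It is only an illustration of the two distributions, not the DCL model: no encoder, decoder, or training is involved, and the concentration values, the latent size of 4, and the helper names `sample_dirichlet` and `sample_gaussian` are assumptions made for the example.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_gaussian(dim, rng):
    """Draw from a standard Gaussian with independent components."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed concentration parameters; values below 1 favor sparse samples.
z_dir = sample_dirichlet([0.5, 0.5, 0.5, 0.5], rng)
z_gauss = sample_gaussian(4, rng)

print("Dirichlet latent:", z_dir, "sum =", sum(z_dir))
print("Gaussian latent: ", z_gauss)
```

A Dirichlet sample is non-negative and sums to one, so it lives on the probability simplex and can be multimodal or sparse depending on the concentration parameters, whereas a Gaussian sample is unbounded and unimodal. That difference in shape is the flexibility the abstract refers to.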

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Speech Recognition and Synthesis

Methods: Knowledge Distillation