PPSEBM: An Energy-Based Model with Progressive Parameter Selection for Continual Learning
Xiaodi Li, Dingcheng Li, Rujun Gao, Mahmoud Zamani, Feng Mi, and Latifur Khan

TL;DR
PPSEBM introduces a novel continual learning framework combining energy-based models with progressive parameter selection, effectively mitigating catastrophic forgetting in NLP tasks by generating pseudo-samples and allocating task-specific parameters.
Contribution
The paper proposes PPSEBM, integrating an energy-based model with progressive parameter selection to improve knowledge retention in continual learning for NLP.
Findings
PPSEBM outperforms state-of-the-art continual learning methods on NLP benchmarks.
Its EBM generates pseudo-samples that retain knowledge of past tasks.
Progressive parameter selection allocates distinct parameters to each task to prevent forgetting.
Abstract
Continual learning remains a fundamental challenge in machine learning, requiring models to learn from a stream of tasks without forgetting previously acquired knowledge. A major obstacle in this setting is catastrophic forgetting, where performance on earlier tasks degrades as new tasks are learned. In this paper, we introduce PPSEBM, a novel framework that integrates an Energy-Based Model (EBM) with Progressive Parameter Selection (PPS) to effectively address catastrophic forgetting in continual learning for natural language processing tasks. In PPSEBM, progressive parameter selection allocates distinct, task-specific parameters for each new task, while the EBM generates representative pseudo-samples from prior tasks. These generated samples actively inform and guide the parameter selection process, enhancing the model's ability to retain past knowledge while adapting to new tasks.…
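The two ingredients named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D quadratic energy, the Langevin sampler, and the names `ProgressiveParams`, `langevin_sample`, and `learn_tasks` are all assumptions made for illustration. The sketch only shows (a) one parameter block being allocated per task while earlier blocks are left untouched, and (b) pseudo-samples for earlier tasks being drawn from a per-task energy function. It collects the pseudo-samples but does not use them to guide parameter selection, as the paper describes.

```python
import math
import random


def energy_grad(x, center):
    """Gradient of the toy energy E(x) = 0.5 * (x - center)**2 (an assumption)."""
    return x - center


def langevin_sample(center, rng, steps=200, step_size=0.1):
    """Draw one pseudo-sample with unadjusted Langevin dynamics."""
    x = rng.gauss(0.0, 1.0)  # arbitrary starting point
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)
        # Step downhill on the energy, then add Gaussian noise.
        x = x - step_size * energy_grad(x, center) + math.sqrt(2.0 * step_size) * noise
    return x


class ProgressiveParams:
    """Hypothetical store that gives every task its own parameter block."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.blocks = {}  # task id -> list of floats owned by that task

    def add_task(self, task_id):
        # Allocate a fresh block; blocks of earlier tasks are never modified.
        self.blocks[task_id] = [0.0] * self.block_size
        return self.blocks[task_id]


def learn_tasks(task_centers, n_pseudo=50, seed=0):
    """Visit tasks in order, drawing pseudo-samples for all earlier tasks."""
    rng = random.Random(seed)
    store = ProgressiveParams(block_size=1)
    replay = {}
    for task_id, center in enumerate(task_centers):
        # Pseudo-samples for every earlier task, drawn from that task's energy.
        replay = {
            prev: [langevin_sample(task_centers[prev], rng) for _ in range(n_pseudo)]
            for prev in range(task_id)
        }
        block = store.add_task(task_id)
        block[0] = center  # stand-in for real training on the new task
    return store, replay


store, replay = learn_tasks([-3.0, 0.0, 4.0])
```

After the call, `store.blocks` holds one block per task, and `replay` holds the pseudo-samples drawn for the two earlier tasks while the third was being learned. In a real system the energy function would be learned and the samples would be text, so this toy only conveys the control flow.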
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
