Tripartite Weight-Space Ensemble for Few-Shot Class-Incremental Learning
Juntae Lee, Munawar Hayat, Sungrack Yun

TL;DR
This paper introduces a tripartite weight-space ensemble (Tri-WE) for few-shot class-incremental learning. It addresses catastrophic forgetting and overfitting by interpolating models in weight space and by strengthening knowledge distillation.
Contribution
The proposed Tri-WE method interpolates the base, previous, and current models in weight space, so that the entire model can be updated continually from only a few examples. The authors report that it surpasses existing methods.
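To make the idea of interpolating three models in weight space concrete, here is a minimal sketch. It is not the authors' implementation: the function name `interpolate_three`, the `Weights` type, and the coefficients `a_base`, `a_prev`, and `a_curr` are hypothetical, and a plain convex combination is assumed because this summary does not state how Tri-WE weights each model or which layers it applies to.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flattened list of values.
Weights = Dict[str, List[float]]


def interpolate_three(
    base: Weights,
    prev: Weights,
    curr: Weights,
    a_base: float,
    a_prev: float,
    a_curr: float,
) -> Weights:
    """Convex combination of three models that share one architecture.

    Assumption: the coefficients are fixed, non-negative, and sum to 1.
    How the paper actually chooses them is not described in this summary.
    """
    coeffs = (a_base, a_prev, a_curr)
    if any(c < 0 for c in coeffs) or abs(sum(coeffs) - 1.0) > 1e-8:
        raise ValueError("coefficients must be non-negative and sum to 1")
    if not (base.keys() == prev.keys() == curr.keys()):
        raise ValueError("models must have identical parameter names")

    merged: Weights = {}
    for name in base:
        b, p, c = base[name], prev[name], curr[name]
        if not (len(b) == len(p) == len(c)):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise weighted average of the three parameter vectors.
        merged[name] = [
            a_base * x + a_prev * y + a_curr * z for x, y, z in zip(b, p, c)
        ]
    return merged


# Toy usage with a single two-element parameter.
base_model = {"w": [0.0, 0.0]}
prev_model = {"w": [1.0, 2.0]}
curr_model = {"w": [2.0, 4.0]}
ensemble = interpolate_three(base_model, prev_model, curr_model, 0.5, 0.25, 0.25)
```

In this toy setting, a larger `a_base` pulls the merged weights toward the base-session model and a larger `a_curr` pulls them toward the model fitted on the newest classes, which is the trade-off between forgetting and overfitting that the summary describes.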
Findings
Achieves state-of-the-art results on miniImageNet, CUB200, and CIFAR100.
Effectively mitigates catastrophic forgetting and overfitting.
Enhances knowledge distillation with amplified data in few-shot scenarios.
Abstract
Few-shot class incremental learning (FSCIL) enables the continual learning of new concepts with only a few training examples. In FSCIL, the model undergoes substantial updates, making it prone to forgetting previous concepts and overfitting to the limited new examples. The most recent trend is to disentangle the learning of the representation from the classification head of the model. A well-generalized feature extractor is learned on the base classes (many examples and many classes) and then fixed during incremental learning. Arguing that the fixed feature extractor restricts the model's adaptability to new classes, we introduce a novel FSCIL method to effectively address catastrophic forgetting and overfitting issues. Our method makes it possible to seamlessly update the entire model with a few examples. We mainly propose a tripartite weight-space ensemble (Tri-WE). Tri-WE interpolates…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Multimodal Machine Learning Applications
