MoTE: Mixture of Task-specific Experts for Pre-Trained Model-Based Class-incremental Learning
Linjie Li, Zhenyu Wu, Yang Ji

TL;DR
This paper introduces MoTE, a novel framework for class-incremental learning with pre-trained models that uses task-specific experts and routing mechanisms to prevent forgetting and handle output dimension inconsistencies.
Contribution
The paper proposes MoTE, a mixture of task-specific experts framework that mitigates output miscalibration and avoids catastrophic forgetting in CIL with pre-trained models.
Findings
MoTE outperforms existing methods without requiring exemplars.
In MoTE, the number of adapters grows linearly with the number of tasks.
Adapter-Limited MoTE balances performance and model complexity.
Abstract
Class-incremental learning (CIL) requires deep learning models to continuously acquire new knowledge from streaming data while preserving previously learned information. Recently, CIL based on pre-trained models (PTMs) has achieved remarkable success. However, prompt-based approaches suffer from prompt overwriting, while adapter-based methods face challenges such as dimensional misalignment between tasks. While the idea of expert fusion in Mixture of Experts (MoE) can help address dimensional inconsistency, both expert and routing parameters are prone to being overwritten in dynamic environments, making MoE challenging to apply directly in CIL. To tackle these issues, we propose a mixture of task-specific experts (MoTE) framework that effectively mitigates the miscalibration caused by inconsistent output dimensions across tasks. Inspired by the weighted feature fusion and sparse…
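The two MoE mechanisms the abstract names, weighted feature fusion and sparse activation, can be illustrated with a minimal sketch. This is generic top-k MoE fusion, not the MoTE procedure itself: the abstract does not specify how MoTE selects or weights its task-specific experts, so the function `sparse_fuse`, its parameter `k`, and the `gate_scores` input are all hypothetical names chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_fuse(expert_features, gate_scores, k=2):
    """Generic top-k MoE fusion (illustrative, not the paper's method).

    expert_features: one feature vector (list of floats) per expert.
    gate_scores: one score per expert (hypothetical input; where these
        scores come from is outside the scope of this sketch).
    Returns the fused feature vector and the indices of the active experts.
    """
    if len(expert_features) != len(gate_scores):
        raise ValueError("need exactly one gate score per expert")
    k = min(k, len(gate_scores))

    # Sparse activation: keep only the k highest-scoring experts.
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]

    # Weighted fusion: renormalize the scores of the active experts
    # and take the weighted sum of their feature vectors.
    weights = softmax([gate_scores[i] for i in top])
    dim = len(expert_features[0])
    fused = [0.0] * dim
    for w, i in zip(weights, top):
        for d in range(dim):
            fused[d] += w * expert_features[i][d]
    return fused, top
```

Because only the selected experts contribute, an expert with a low score has no effect on the fused feature, which is the property that makes sparse activation attractive when experts are tied to different tasks.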
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Educational Assessment and Pedagogy
