Task-Aware Multi-Expert Architecture For Lifelong Deep Learning
Jianyu Wang, Jacob Nean-Hua Sheikh, Cat P. Le, and Hoda Bidkhori

TL;DR
The paper introduces TAME, a task-aware multi-expert architecture for lifelong deep learning that dynamically selects experts based on task similarity, uses replay buffers and attention to mitigate forgetting, and demonstrates improved performance on sequential classification tasks.
Contribution
TAME is a novel continual learning algorithm that leverages task similarity for expert selection and incorporates replay and attention mechanisms to enhance knowledge retention.
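The similarity-guided expert selection described above can be illustrated with a minimal sketch. This summary does not say how TAME measures task similarity, so the sketch assumes each task and each expert is represented by a fixed-length embedding and uses cosine similarity as a stand-in. The function `select_expert` and both embedding arrays are hypothetical names introduced only for illustration.

```python
import numpy as np

def select_expert(task_embedding, expert_embeddings):
    """Pick the expert whose embedding is closest to the new task.

    Assumption: cosine similarity stands in for the paper's task-similarity
    measure, which this summary does not specify.
    """
    task = task_embedding / np.linalg.norm(task_embedding)
    experts = expert_embeddings / np.linalg.norm(
        expert_embeddings, axis=1, keepdims=True
    )
    similarities = experts @ task  # one cosine score per expert
    return int(np.argmax(similarities)), similarities

# Hypothetical pool of three experts, each summarized by a 3-d embedding.
expert_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
new_task = np.array([0.1, 0.9, 0.2])

best, sims = select_expert(new_task, expert_embeddings)
print("selected expert:", best)
```

Under these assumptions the new task lies closest to the second expert's embedding, so that expert would be the one activated.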
Findings
TAME improves accuracy on new tasks compared to baseline methods.
TAME maintains performance on earlier tasks while learning new ones.
Experiments on CIFAR-100 derived tasks validate its effectiveness.
Abstract
Lifelong deep learning (LDL) trains neural networks to learn sequentially across tasks while preserving prior knowledge. We propose Task-Aware Multi-Expert (TAME), a continual learning algorithm that leverages task similarity to guide expert selection and knowledge transfer. TAME maintains a pool of pretrained neural networks and activates the most relevant expert for each new task. A shared dense layer integrates features from the chosen expert to generate predictions. To reduce catastrophic forgetting, TAME uses a replay buffer that stores representative samples and embeddings from previous tasks and reuses them during training. An attention mechanism further prioritizes the most relevant stored information for each prediction. Together, these components allow TAME to adapt flexibly while retaining important knowledge across evolving task sequences. Experiments on binary…
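To make the replay-and-attention idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the buffer stores embedding vectors with labels, evicts the oldest entry when full, and scores stored entries against a query with scaled dot-product attention. The names `ReplayBuffer` and `attend`, the first-in-first-out eviction policy, and the scoring rule are all assumptions chosen for illustration.

```python
import numpy as np

class ReplayBuffer:
    """Fixed-capacity store of (embedding, label) pairs from earlier tasks.

    Assumption: oldest-first eviction; the abstract does not describe how
    representative samples are chosen.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.embeddings = []
        self.labels = []

    def add(self, embedding, label):
        if len(self.embeddings) >= self.capacity:
            # Drop the oldest stored entry to make room.
            self.embeddings.pop(0)
            self.labels.pop(0)
        self.embeddings.append(np.asarray(embedding, dtype=float))
        self.labels.append(label)

def attend(query, buffer):
    """Weight stored embeddings by relevance to the query (softmax of
    scaled dot products) and return the weights and the weighted context."""
    keys = np.stack(buffer.embeddings)
    scores = keys @ query / np.sqrt(query.shape[0])
    scores = scores - scores.max()  # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ keys
    return weights, context

buffer = ReplayBuffer(capacity=3)
for emb, label in [([1, 0], 0), ([0, 1], 1), ([1, 1], 0), ([2, 0], 1)]:
    buffer.add(emb, label)

query = np.array([1.0, 0.0])
weights, context = attend(query, buffer)
print("attention weights:", weights)
```

In this toy run the first entry is evicted once the buffer is full, and the stored embedding most aligned with the query receives the largest weight. In a full model the context vector would presumably be combined with the selected expert's features before the shared dense layer, but that wiring is not specified here.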
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
