MoRAL: MoE Augmented LoRA for LLMs' Lifelong Learning
Shu Yang, Muhammad Asif Ali, Cheng-Long Wang, Lijie Hu, and Di Wang

TL;DR
MoRAL introduces a novel approach combining Mixture-of-Experts and Low-Rank Adaptation to enable large language models to learn continuously from question-answer pairs, improving efficiency, robustness, and knowledge retention.
Contribution
The paper proposes MoRAL, a new method integrating MoE and LoRA for lifelong learning of LLMs using simple QA pairs, along with a new benchmark and evaluation metrics.
Findings
LLMs learn faster in open-book settings with up to 30.15% improvement.
MoRAL performs better with larger models.
MoRAL demonstrates robustness against catastrophic forgetting.
Abstract
Adapting large language models (LLMs) to new domains and tasks, and enabling them to be efficient lifelong learners, is a pivotal challenge. In this paper, we propose MoRAL, i.e., Mixture-of-Experts augmented Low-Rank Adaptation for Lifelong Learning. MoRAL combines the multi-tasking abilities of MoE with the fine-tuning abilities of LoRA for effective lifelong learning of LLMs. In contrast to conventional approaches that use factual triplets as inputs, MoRAL relies on simple question-answer pairs, which is a more practical and effective strategy for robust and efficient learning. Owing to the new data setting, we introduce a new evaluation benchmark, Life Long Learning of LLM (5L-bench), encompassing a newly curated dataset of question-answer pairs and a set of evaluation metrics for rigorous evaluation of MoRAL in open-book and closed-book settings. Experimental evaluation shows (i)…
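To make the idea of combining MoE with LoRA more concrete, the following is a minimal sketch of a linear layer whose frozen weight is augmented by several low-rank "expert" updates, mixed by a softmax router. It is a generic illustration and not the authors' implementation: the class name `MoELoRALinear`, the dense softmax routing, the zero initialization of the `B` matrices, and all dimensions are assumptions made for this example. The abstract does not specify the routing scheme or any scaling factors.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    e = [math.exp(t - m) for t in z]
    s = sum(e)
    return [t / s for t in e]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small Gaussian-initialized matrix."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class MoELoRALinear:
    """Hypothetical layer: y = W x + sum_i g_i(x) * B_i (A_i x).

    W is the frozen pretrained weight; each expert i is a LoRA pair
    (A_i, B_i) of rank `rank`; g(x) = softmax(router @ x).
    """

    def __init__(self, d_in, d_out, rank=2, n_experts=4, seed=0):
        rng = random.Random(seed)
        self.W = rand_matrix(d_out, d_in, rng)           # frozen base weight
        self.router = rand_matrix(n_experts, d_in, rng)  # trainable gate
        self.A = [rand_matrix(rank, d_in, rng) for _ in range(n_experts)]
        # Assumption: B starts at zero (the usual LoRA initialization),
        # so the layer initially reproduces the frozen model's output.
        self.B = [[[0.0] * rank for _ in range(d_out)]
                  for _ in range(n_experts)]

    def gates(self, x):
        """Mixing weights over experts for input x (sum to 1)."""
        return softmax(matvec(self.router, x))

    def forward(self, x):
        y = matvec(self.W, x)
        for g_i, A_i, B_i in zip(self.gates(x), self.A, self.B):
            delta = matvec(B_i, matvec(A_i, x))  # low-rank update B_i A_i x
            y = [y_j + g_i * d_j for y_j, d_j in zip(y, delta)]
        return y


layer = MoELoRALinear(d_in=4, d_out=3)
x = [1.0, 0.5, -0.2, 0.3]
print(layer.gates(x))    # expert mixing weights for this input
print(layer.forward(x))  # equals the frozen output until B is trained
```

In this sketch only the router and the `A`/`B` matrices would be trained, while `W` stays fixed, which is what keeps the number of trainable parameters small relative to full fine-tuning.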
Taxonomy
Topics: Distributed and Parallel Computing Systems · Semantic Web and Ontologies · Advanced Data Processing Techniques
Methods: Sparse Evolutionary Training
