Tackling Distribution Shift in LLM via KILO: Knowledge-Instructed Learning for Continual Adaptation

Iing Muttakhiroh; Thomas Fevens

arXiv:2508.03571 · cs.CL · August 6, 2025

TL;DR

KILO is a continual learning framework for LLMs that combines dynamic knowledge graphs with instruction tuning, so that a model facing domain shift can adapt to new domains while retaining previously acquired knowledge.

Contribution

This work introduces KILO, a novel method that combines knowledge retrieval with instruction tuning for continual adaptation of LLMs and outperforms existing baselines in the reported experiments.

Findings

01

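KILO outperforms strong baselines (continual fine-tuning, ERNIE 2.0, and CPT) on backward transfer, forward transfer, F1 score, retention rate, and training efficiency.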

02

KILO improves both backward and forward transfer during sequential domain adaptation.

03

KILO enhances training efficiency and knowledge retention.

Abstract

Large Language Models (LLMs) often suffer from performance degradation when faced with domain shifts, primarily due to catastrophic forgetting. In this work, we propose KILO (Knowledge-Instructed Learning for Continual Adaptation), a novel continual learning framework that integrates dynamic knowledge graphs with instruction tuning. By leveraging retrieved domain-specific knowledge as guidance during training, KILO enhances both adaptability to new domains and retention of previously acquired knowledge. We pretrain our model on WikiText-103 and evaluate sequential adaptation across four diverse target domains: BioASQ, SciQ, TweetEval, and MIND. Our experiments demonstrate that KILO consistently outperforms strong baselines, including continual fine-tuning, ERNIE 2.0, and CPT, in terms of backward transfer, forward transfer, F1 score, retention rate, and training efficiency. These…
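The abstract reports backward and forward transfer but this page does not give the formulas. As a minimal sketch, the block below uses the definitions common in the continual learning literature (Lopez-Paz and Ranzato, 2017), which may differ from the ones the authors use. The score matrix `R`, the `baseline` vector, and every number in them are hypothetical values chosen for illustration, not results from the paper.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def backward_transfer(R: Matrix) -> float:
    """Average change in score on earlier domains after all training is done.

    R[i][j] is the score on domain j after training sequentially through
    domain i (0-indexed). Negative values indicate forgetting.
    """
    T = len(R)
    if T < 2:
        raise ValueError("need at least two domains")
    return sum(R[T - 1][j] - R[j][j] for j in range(T - 1)) / (T - 1)


def forward_transfer(R: Matrix, baseline: Sequence[float]) -> float:
    """Average gain on a domain just before training on it.

    baseline[j] is the score of a reference model that has not been
    adapted at all (an assumption of this sketch).
    """
    T = len(R)
    if T < 2:
        raise ValueError("need at least two domains")
    return sum(R[j - 1][j] - baseline[j] for j in range(1, T)) / (T - 1)


# Hypothetical scores for the four target domains named in the abstract,
# in the order BioASQ, SciQ, TweetEval, MIND. Illustrative numbers only.
R = [
    [0.80, 0.30, 0.20, 0.10],
    [0.75, 0.85, 0.25, 0.15],
    [0.70, 0.80, 0.90, 0.20],
    [0.65, 0.78, 0.88, 0.82],
]
baseline = [0.10, 0.10, 0.10, 0.10]

bwt = backward_transfer(R)
fwt = forward_transfer(R, baseline)
```

Under these definitions, a method that retains earlier domains better pushes backward transfer toward zero or above, while a method whose earlier training helps later domains raises forward transfer.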
