TL;DR
This paper investigates how different fine-tuning tasks affect the retention and transferability of injected knowledge in large language models, highlighting the importance of task type and model size.
Contribution
It reveals that comprehension-focused tasks lead to better knowledge retention and transfer in LLMs compared to mapping tasks, across various architectures and scales.
Findings
Comprehension tasks reach higher retention rates (48%) than mapping tasks (17-20%)
Larger models show improved knowledge retention across all task types
Injected knowledge transfer diminishes in broader contexts, indicating limited semantic integration
Abstract
As the knowledge of large language models (LLMs) becomes outdated over time, there is a growing need for efficient methods to update them, especially when injecting proprietary information. Our study reveals that comprehension-intensive fine-tuning tasks (e.g., question answering and fill-in-the-blank) achieve substantially higher knowledge retention rates (48%) than mapping-oriented tasks such as translation (17%) or text-to-JSON conversion (20%), despite exposure to identical factual content. We demonstrate that this pattern persists across model architectures and follows scaling laws, with larger models showing improved retention across all task types. However, all models exhibit significant performance drops when applying injected knowledge in broader contexts, suggesting limited semantic integration. These findings underscore the importance of task selection in updating LLM knowledge, showing…