Fine-tuning can cripple your foundation model; preserving features may be the solution
Jishnu Mukhoti, Yarin Gal, Philip H.S. Torr, Puneet K. Dokania

TL;DR
This paper shows that fine-tuning large models can erase pre-trained knowledge, and introduces LDIFS, a fine-tuning method that preserves this knowledge while adapting to new tasks. The claims are supported by extensive experiments.
Contribution
The paper proposes LDIFS, a novel fine-tuning approach that mitigates concept forgetting and enhances continual learning capabilities of foundation models.
Findings
LDIFS significantly reduces concept forgetting across 10 tasks.
LDIFS outperforms standard fine-tuning and continual learning baselines.
LDIFS effectively preserves pre-trained knowledge during sequential task learning.
Abstract
Pre-trained foundation models, due to their enormous capacity and exposure to vast amounts of data during pre-training, are known to have learned plenty of real-world concepts. An important step in making these pre-trained models effective on downstream tasks is to fine-tune them on related datasets. While various fine-tuning methods have been devised and have been shown to be highly effective, we observe that a fine-tuned model's ability to recognize concepts on tasks different from the downstream one is reduced significantly compared to its pre-trained counterpart. This is an undesirable effect of fine-tuning as a substantial amount of resources was used to learn these pre-trained concepts in the first place. We call this phenomenon "concept forgetting" and via experiments show that most end-to-end fine-tuning approaches suffer heavily from this side effect. To this end,…
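The summary above does not spell out how LDIFS works. One plausible reading of "preserving features" is a penalty on how far the fine-tuned model's internal features drift from those of the frozen pre-trained model on the same inputs. The sketch below illustrates that idea only, under that assumption; it is not the authors' implementation. The function names (`squared_l2`, `feature_drift`, `regularized_loss`), the averaging over layers, and the `weight` parameter are all hypothetical choices made for illustration.

```python
from typing import Sequence


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def feature_drift(
    pretrained_feats: Sequence[Sequence[float]],
    finetuned_feats: Sequence[Sequence[float]],
) -> float:
    """Average squared L2 distance over per-layer feature vectors.

    Assumption: both models were run on the same input, and each entry
    holds the features taken from one (hypothetical) layer.
    """
    if len(pretrained_feats) != len(finetuned_feats):
        raise ValueError("both models must expose the same number of layers")
    if not pretrained_feats:
        raise ValueError("at least one layer of features is required")
    total = sum(squared_l2(p, f) for p, f in zip(pretrained_feats, finetuned_feats))
    return total / len(pretrained_feats)


def regularized_loss(
    task_loss: float,
    pretrained_feats: Sequence[Sequence[float]],
    finetuned_feats: Sequence[Sequence[float]],
    weight: float = 1.0,
) -> float:
    """Downstream task loss plus a weighted feature-drift penalty."""
    return task_loss + weight * feature_drift(pretrained_feats, finetuned_feats)


if __name__ == "__main__":
    # Toy features from two layers; only the first layer has drifted.
    before = [[0.0, 0.0], [1.0, 1.0]]
    after = [[3.0, 4.0], [1.0, 1.0]]
    print(feature_drift(before, after))
    print(regularized_loss(2.0, before, after, weight=0.5))
```

In a real training loop the same penalty would be computed on tensors and differentiated with respect to the fine-tuned model's parameters only, with the pre-trained model kept frozen. A larger `weight` trades downstream accuracy for retention of pre-trained features.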
