Model-Dowser: Data-Free Importance Probing to Mitigate Catastrophic Forgetting in Multimodal Large Language Models

Hyeontaek Hwang; Nguyen Dinh Son; Daeyoung Kim

arXiv:2602.04509·cs.CL·May 22, 2026

TL;DR

Model-Dowser is a sparse fine-tuning method for multimodal large language models. It preserves high-importance parameters during fine-tuning to mitigate catastrophic forgetting, and it remains scalable to multi-billion-parameter models.

Contribution

The paper introduces a principled, data-free importance score for each model parameter, which enables effective and resource-efficient mitigation of forgetting during fine-tuning.

Findings

01 Outperforms prior methods in mitigating catastrophic forgetting

02 Effective on large-scale multimodal models like LLaVA and NVILA

03 Remains resource-efficient and scalable to multi-billion-parameter models

Abstract

Fine-tuning Multimodal Large Language Models (MLLMs) on task-specific data is an effective way to improve performance on downstream applications. However, such adaptation often leads to a degradation in generalization on pretrained tasks, a phenomenon known as Catastrophic Forgetting. Existing methods that aim to mitigate this issue either become ineffective when fine-tuning deeper layers of the language decoder or scale poorly with increasing model size. To address these limitations, we propose Model-Dowser, a novel sparse fine-tuning approach for MLLMs. Model-Dowser measures a principled importance score for each model parameter with respect to pretrained generalization (prior to downstream adaptation) by jointly considering weight magnitudes, input activations, and output sensitivities. During fine-tuning, Model-Dowser selectively preserves high-importance parameters and updates the…
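To make the idea in the abstract concrete, here is a minimal sketch of importance-guided sparse fine-tuning for a single weight matrix. It is not the paper's formulation. The abstract names the three ingredients (weight magnitudes, input activations, output sensitivities) but not how they are combined, so the product form below is an assumption. The function names, the `frozen_ratio` parameter, and the choice to update only the lower-scoring parameters are also hypothetical, since the abstract is truncated at that point.

```python
# Illustrative sketch only: the scoring formula and all names are assumptions,
# not the method as defined in the paper.

def importance_scores(weights, input_norms, output_sens):
    """Score each weight W[i][j] as |W[i][j]| * input_norms[j] * output_sens[i].

    weights:     rows are output units, columns are input units
    input_norms: one non-negative activation statistic per input unit
    output_sens: one non-negative sensitivity statistic per output unit
    """
    return [
        [abs(w) * input_norms[j] * output_sens[i] for j, w in enumerate(row)]
        for i, row in enumerate(weights)
    ]


def trainable_mask(scores, frozen_ratio):
    """Return a boolean mask that is False for the top-scoring fraction
    (frozen, i.e. preserved) and True for the rest (free to update)."""
    flat = [(s, i, j) for i, row in enumerate(scores) for j, s in enumerate(row)]
    n_frozen = int(round(frozen_ratio * len(flat)))
    flat.sort(key=lambda t: t[0], reverse=True)  # highest importance first
    frozen = {(i, j) for _, i, j in flat[:n_frozen]}
    return [
        [(i, j) not in frozen for j in range(len(row))]
        for i, row in enumerate(scores)
    ]


def sparse_sgd_step(weights, grads, mask, lr):
    """Apply a plain SGD update only where the mask allows it."""
    return [
        [w - lr * g if m else w for w, g, m in zip(w_row, g_row, m_row)]
        for w_row, g_row, m_row in zip(weights, grads, mask)
    ]


# Toy example with made-up numbers.
W = [[2.0, -0.1], [0.5, 1.0]]
scores = importance_scores(W, input_norms=[1.0, 2.0], output_sens=[1.0, 0.5])
mask = trainable_mask(scores, frozen_ratio=0.5)
grads = [[1.0, 1.0], [1.0, 1.0]]
W_new = sparse_sgd_step(W, grads, mask, lr=0.1)
print(mask)  # [[False, True], [True, False]]
```

In this toy case the two highest-scoring entries, `W[0][0]` and `W[1][1]`, stay fixed and only the other two move. In practice the mask would be computed once from the pretrained model, before any downstream adaptation, and then reused at every optimizer step.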
