Inverse-Hessian Regularization for Continual Learning in ASR
Steven Vander Eeckt, Hugo Van hamme

TL;DR
This paper introduces Inverse Hessian Regularization (IHR), a novel, memory-free method for continual learning in ASR that leverages curvature information to mitigate catastrophic forgetting during model adaptation.
Contribution
The paper proposes IHR, which incorporates a Kronecker-factored inverse Hessian approximation of the previous task into the merging step that follows fine-tuning, improving continual learning performance while remaining memory-free.
Findings
IHR significantly reduces forgetting compared to baselines.
IHR improves adaptability in continual learning benchmarks.
Ablation studies confirm the effectiveness of curvature-based regularization.
Abstract
Catastrophic forgetting remains a major challenge for continual learning (CL) in automatic speech recognition (ASR), where models must adapt to new domains without losing performance on previously learned conditions. Several CL methods have been proposed for ASR, and, recently, weight averaging - where models are averaged in a merging step after fine-tuning - has proven effective as a simple memory-free strategy. However, it is heuristic in nature and ignores the underlying loss landscapes of the tasks, hindering adaptability. In this work, we propose Inverse Hessian Regularization (IHR), a memory-free approach for CL in ASR that incorporates curvature information into the merging step. After fine-tuning on a new task, the adaptation is adjusted through a Kronecker-factored inverse Hessian approximation of the previous task, ensuring that the model moves primarily in directions less…
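The merging idea described in the abstract can be illustrated with a minimal sketch for a single linear layer. Everything below is an assumption for illustration, not the authors' implementation: the function names, the factor-wise damping, the Frobenius-norm rescaling, and the choice of NumPy are all hypothetical. It assumes the previous task's Hessian for a weight matrix of shape (out, in) is approximated as a Kronecker product of an input-activation covariance `a_cov` (in x in) and an output-gradient covariance `g_cov` (out x out), so that applying the inverse to the fine-tuning update reduces to two small linear solves.

```python
import numpy as np


def kron_inverse_precondition(delta_w, a_cov, g_cov, damping=1e-3):
    """Apply a damped Kronecker-factored inverse to an update.

    Computes G^-1 @ delta_w @ A^-1, where A and G are the (damped)
    Kronecker factors of the previous task's curvature approximation.
    Damping each factor separately is an illustrative simplification.
    """
    out_dim, in_dim = delta_w.shape
    a_damped = a_cov + damping * np.eye(in_dim)
    g_damped = g_cov + damping * np.eye(out_dim)
    # Left factor: solve G X = delta_w, i.e. X = G^-1 delta_w.
    left = np.linalg.solve(g_damped, delta_w)
    # Right factor: X A^-1, computed as (A^-1 X^T)^T since A is symmetric.
    return np.linalg.solve(a_damped, left.T).T


def merge_with_curvature(w_old, w_new, a_cov, g_cov, damping=1e-3):
    """Move from w_old toward w_new along a curvature-adjusted direction.

    The rescaling keeps the step the same Frobenius norm as the raw
    fine-tuning update; this normalization is an assumption of the sketch.
    """
    delta = w_new - w_old
    adjusted = kron_inverse_precondition(delta, a_cov, g_cov, damping)
    scale = np.linalg.norm(delta) / (np.linalg.norm(adjusted) + 1e-12)
    return w_old + scale * adjusted


if __name__ == "__main__":
    w_old = np.zeros((2, 2))
    w_new = np.ones((2, 2))
    # Hypothetical curvature: the first input dimension matters much more
    # to the previous task than the second.
    a_cov = np.diag([100.0, 1.0])
    g_cov = np.eye(2)
    print(merge_with_curvature(w_old, w_new, a_cov, g_cov))
```

In this toy setup the merged weights move far less along the first input dimension, where the assumed previous-task curvature is high, than along the second, which is the qualitative behavior the abstract attributes to the curvature-aware merging step.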
