Generalized Fisher-Weighted SVD: Scalable Kronecker-Factored Fisher Approximation for Compressing Large Language Models

Viktoriia Chekalina; Daniil Moskovskiy; Daria Cherniuk; Maxim Kurkin; Andrey Kuznetsov; Evgeny Frolov

arXiv:2505.17974 · cs.LG · May 26, 2025

TL;DR

This paper introduces GFWSVD, a scalable method for compressing large language models by approximating the Fisher information matrix more accurately, leading to better performance than existing diagonal-based methods.

Contribution

We propose GFWSVD, a novel Fisher-weighted SVD method that captures parameter correlations using a scalable Kronecker-factored approximation for improved model compression.
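To make the idea of a Fisher-weighted SVD with off-diagonal terms concrete, here is a minimal NumPy sketch. It assumes the Fisher information of one weight matrix W (m x n, flattened row by row) is approximated as a Kronecker product L ⊗ R, with L (m x m) and R (n x n) symmetric positive definite. Under that assumption the Fisher-weighted error of a perturbation D equals ||C_L^T D C_R||_F^2, where L = C_L C_L^T and R = C_R C_R^T are Cholesky factorizations, so the best rank-r approximation comes from a truncated SVD of the transformed matrix. The function names and the exact formulation are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def weighted_error(W, W_hat, L, R):
    """Error of W_hat under the Kronecker-factored weight L (x) R.

    Equals sqrt(trace(D^T L D R)) with D = W - W_hat (R symmetric).
    """
    D = W - W_hat
    return float(np.sqrt(np.sum((L @ D @ R) * D)))


def kron_weighted_low_rank(W, L, R, rank):
    """Rank-`rank` factors (left, right) minimizing weighted_error.

    Hypothetical helper: assumes L and R are symmetric positive definite.
    """
    C_L = np.linalg.cholesky(L)  # L = C_L @ C_L.T
    C_R = np.linalg.cholesky(R)  # R = C_R @ C_R.T

    # In the transformed space the weighted problem is a plain
    # Frobenius-norm problem, solved by truncated SVD (Eckart-Young).
    M = C_L.T @ W @ C_R
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    U_r = U[:, :rank] * s[:rank]
    Vt_r = Vt[:rank]

    # Map the factors back: W_hat = C_L^{-T} (U_r Vt_r) C_R^{-1}.
    left = np.linalg.solve(C_L.T, U_r)         # shape (m, rank)
    right = np.linalg.solve(C_R.T, Vt_r.T).T   # shape (rank, n)
    return left, right
```

With L and R set to identity matrices this reduces to the ordinary truncated SVD. A diagonal L with R equal to the identity gives a row-weighted variant, which is one way to see how a diagonal approximation drops the correlations that the full factors retain.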

Findings

01. Outperforms diagonal Fisher approximation methods on LLM compression.

02. Achieves 5% better accuracy at 20x compression rate on MMLU.

03. Provides a scalable approach for Fisher information-based model compression.

Abstract

The Fisher information is a fundamental concept for characterizing the sensitivity of parameters in neural networks. However, leveraging the full observed Fisher information is too expensive for large models, so most methods rely on simple diagonal approximations. While efficient, this approach ignores parameter correlations, often resulting in reduced performance on downstream tasks. In this work, we mitigate these limitations and propose Generalized Fisher-Weighted SVD (GFWSVD), a post-training LLM compression technique that accounts for both diagonal and off-diagonal elements of the Fisher information matrix, providing a more accurate reflection of parameter importance. To make the method tractable, we introduce a scalable adaptation of the Kronecker-factored approximation algorithm for the observed Fisher information. We demonstrate the effectiveness of our method on LLM…
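One generic way to obtain a Kronecker-factored approximation of the observed Fisher information without ever forming the full (mn x mn) matrix is sketched below. It assumes access to per-sample gradients G_i of one weight matrix (each m x n) and looks for factors L (m x m) and R (n x n) that minimize the Frobenius distance between the empirical Fisher (1/N) Σ vec(G_i) vec(G_i)^T and L ⊗ R, using alternating updates (a power iteration on the rearranged matrix, in the style of the classical nearest-Kronecker-product problem). This is an illustration under those assumptions and may differ from the algorithm the paper actually uses; the function name and the iteration count are hypothetical.

```python
import numpy as np


def kron_factors_from_grads(grads, n_iter=20):
    """Fit L (m x m), R (n x n) so that L (x) R approximates the
    empirical Fisher built from per-sample gradients.

    grads: array of shape (N, m, n), one gradient matrix per sample,
           with vec() taken row by row (NumPy's default ravel order).
    """
    N, m, n = grads.shape
    R = np.eye(n)  # positive semidefinite starting point
    for _ in range(n_iter):
        # Least-squares update of L for fixed R:
        #   L = (1/N) sum_i G_i R G_i^T / ||R||_F^2
        L = np.einsum("iab,bd,icd->ac", grads, R, grads) / (N * np.sum(R * R))
        # Least-squares update of R for fixed L:
        #   R = (1/N) sum_i G_i^T L G_i / ||L||_F^2
        R = np.einsum("iab,ac,icd->bd", grads, L, grads) / (N * np.sum(L * L))
    return L, R
```

Each update costs only products with the m x n gradient matrices, and both factors remain symmetric positive semidefinite when the iteration starts from the identity, since every update is a sum of congruence transforms of a positive semidefinite matrix.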

Code & Models

Models: Sayankotor/FastKronQuantization (Hugging Face model)
