SWSC: Shared Weight for Similar Channel in LLM
Binrui Zeng, Yongtao Tang, Xiaodong Liu, Xiaopeng Li

TL;DR
SWSC is an LLM compression technique that clusters similar weight channels with K-Means, replaces every channel in a cluster with one shared representative vector, and applies SVD-based compensation to limit the resulting performance loss, making large models easier to deploy.
Contribution
The paper introduces SWSC, a new compression method combining clustering and SVD-based correction to effectively reduce LLM size while preserving accuracy.
Findings
Significant parameter reduction achieved with minimal performance loss.
Effective in low-precision settings for LLMs.
Outperforms existing compression methods in experiments.
Abstract
Large language models (LLMs) have spurred development in multiple industries. However, the growing number of their parameters brings substantial storage and computing burdens, making it essential to explore model compression techniques for parameter reduction and easier deployment. We propose SWSC, an LLM compression method based on the concept of Shared Weight for Similar Channel. It uses the K-Means clustering algorithm to cluster model weights channel-by-channel, generating clusters with highly similar vectors within each. A representative vector from each cluster is selected to approximately replace all vectors in the cluster, significantly reducing the number of model weight parameters. However, approximate restoration will inevitably cause damage to the performance of the model. To tackle this issue, we perform singular value decomposition on the weight error values before and…
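The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It makes several assumptions: each row of the weight matrix is treated as one channel, the cluster centroid stands in for the "representative vector", and the SVD of the error is truncated to a fixed rank to form a low-rank correction. The function names `kmeans`, `swsc_sketch`, and `reconstruct`, and the parameters `k` and `rank`, are hypothetical.

```python
import numpy as np

def kmeans(vectors, k, iters=20, seed=0):
    """Plain Lloyd's k-means on the rows of `vectors`; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct rows.
    centroids = vectors[rng.choice(len(vectors), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every row to every centroid.
        dist = ((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = vectors[labels == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    # Final assignment against the updated centroids.
    dist = ((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dist.argmin(axis=1)

def swsc_sketch(W, k, rank):
    """Cluster the channels of W, share one vector per cluster, and
    build a low-rank correction from the SVD of the error (assumed form)."""
    centroids, labels = kmeans(W, k)      # assumption: one row = one channel
    W_shared = centroids[labels]          # every channel replaced by its centroid
    E = W - W_shared                      # error introduced by weight sharing
    U, S, Vt = np.linalg.svd(E, full_matrices=False)
    A = U[:, :rank] * S[:rank]            # (channels, rank)
    B = Vt[:rank]                         # (rank, width)
    return centroids, labels, A, B

def reconstruct(centroids, labels, A, B):
    """Approximate weights: shared vectors plus the low-rank correction."""
    return centroids[labels] + A @ B

# Toy usage on a random matrix (not real model weights).
W = np.random.default_rng(1).standard_normal((64, 32))
centroids, labels, A, B = swsc_sketch(W, k=8, rank=4)
W_hat = reconstruct(centroids, labels, A, B)
print("error without correction:", np.linalg.norm(W - centroids[labels]))
print("error with correction:   ", np.linalg.norm(W - W_hat))
```

In this sketch the stored values are the `k` centroids, one cluster index per channel, and the two thin factors `A` and `B`. Whether that is smaller than the original matrix depends on the choice of `k` and `rank`.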
Taxonomy
Methods: k-Means Clustering
