PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models
Fanxu Meng, Zhaohui Wang, Muhan Zhang

TL;DR
PiSSA is a parameter-efficient fine-tuning (PEFT) method for LLMs that initializes the adapter matrices with the principal components of the pretrained weights, leading to faster convergence and better performance than LoRA across various models and tasks.
Contribution
PiSSA introduces principal component-based initialization for PEFT, improving convergence speed and accuracy over LoRA while maintaining compatibility with quantization.
Findings
PiSSA outperforms LoRA on multiple models and tasks.
PiSSA achieves higher accuracy than LoRA on the GSM8K benchmark.
PiSSA can be efficiently initialized with a fast SVD technique.
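The last finding mentions a fast SVD without saying which one. As a rough illustration of the idea only, the sketch below uses a standard randomized SVD with power iterations to approximate the top-r singular triplets without a full decomposition. The function name `randomized_svd` and the parameters `n_oversample` and `n_iter` are hypothetical choices for this sketch, not necessarily the routine the authors use.

```python
import numpy as np

def randomized_svd(W, r, n_oversample=8, n_iter=4, seed=0):
    """Approximate the top-r singular triplets of W (hypothetical helper).

    Assumption: a generic randomized range finder with power iterations,
    used here as a stand-in for the "fast SVD" the summary refers to.
    """
    rng = np.random.default_rng(seed)
    m, n = W.shape
    k = min(r + n_oversample, min(m, n))   # sketch width
    Y = W @ rng.standard_normal((n, k))    # random projection of the range
    for _ in range(n_iter):
        # Power iterations sharpen the estimate of the dominant subspace;
        # QR after each product keeps the basis numerically stable.
        Y, _ = np.linalg.qr(Y)
        Z, _ = np.linalg.qr(W.T @ Y)
        Y = W @ Z
    Q, _ = np.linalg.qr(Y)                 # orthonormal basis, shape (m, k)
    # Exact SVD of the small (k, n) projected matrix.
    Ub, S, Vt = np.linalg.svd(Q.T @ W, full_matrices=False)
    return (Q @ Ub)[:, :r], S[:r], Vt[:r]

# Toy matrix: rank-10 signal plus small noise, so the top 10 values dominate.
rng = np.random.default_rng(1)
W = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 100))
W += 1e-3 * rng.standard_normal((200, 100))

U, S, Vt = randomized_svd(W, r=10)
S_exact = np.linalg.svd(W, compute_uv=False)
assert np.allclose(S, S_exact[:10], rtol=1e-4)
```

The cost is dominated by a few products with a thin (n, k) matrix instead of a full decomposition of W, which is why this family of methods is attractive when only the leading r components are needed.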
Abstract
To parameter-efficiently fine-tune (PEFT) large language models (LLMs), the low-rank adaptation (LoRA) method approximates the model changes $\Delta W \in \mathbb{R}^{m \times n}$ through the product of two matrices $A \in \mathbb{R}^{m \times r}$ and $B \in \mathbb{R}^{r \times n}$, where $r \ll \min(m, n)$, $A$ is initialized with Gaussian noise, and $B$ with zeros. LoRA freezes the original model $W$ and updates the "Noise & Zero" adapter, which may lead to slow convergence. To overcome this limitation, we introduce Principal Singular values and Singular vectors Adaptation (PiSSA). PiSSA shares the same architecture as LoRA, but initializes the adapter matrices $A$ and $B$ with the principal components of the original matrix $W$, and puts the remaining components into a residual matrix $W^{res} \in \mathbb{R}^{m \times n}$ which is frozen during fine-tuning. Compared to LoRA, PiSSA…
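The initialization described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `pissa_init` and `lora_init` are hypothetical, and splitting the square root of each singular value evenly between the two factors is an assumption of this sketch (the abstract only says the principal components go into the adapter and the rest into a frozen residual).

```python
import numpy as np

def lora_init(m, n, r, rng):
    """LoRA-style "Noise & Zero" start: the adapter product is zero."""
    A = rng.standard_normal((m, r))   # Gaussian noise
    B = np.zeros((r, n))              # zeros
    return A, B

def pissa_init(W, r):
    """Split W into a trainable rank-r adapter (A, B) and a frozen residual.

    Assumption: sqrt(S) is shared evenly between A and B.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    sqrt_S = np.sqrt(S[:r])
    A = U[:, :r] * sqrt_S             # (m, r): top-r left vectors, scaled
    B = sqrt_S[:, None] * Vt[:r]      # (r, n): top-r right vectors, scaled
    W_res = W - A @ B                 # remaining components, kept frozen
    return A, B, W_res

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))

A0, B0 = lora_init(64, 32, r=4, rng=rng)
A, B, W_res = pissa_init(W, r=4)

# Both schemes leave the layer's initial output unchanged.
assert np.allclose(W + A0 @ B0, W)
assert np.allclose(W_res + A @ B, W)
```

In both cases the effective weight at step zero equals the pretrained W. The difference is what the trainable part holds: a zero product for LoRA, versus the best rank-r approximation of W in this sketch of PiSSA.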
Code & Models
- 🤗 sunatte/txt2sql (model)
- 🤗 MachoMaheen/devdock4bit (model)
- 🤗 biluo/Bronya-Qwen2.5-14B-Instruct-Pissa (model, ♡ 1)
- 🤗 budecosystem/Maxwell-TCS-v0.2 (model, 5 dl, ♡ 4)
- 🤗 sicer/arc-agi-legacy (model)
- 🤗 JilinHu/llemma_7b_3epoch_r32_e5_RQ1 (model, 1 dl)
- 🤗 Xin-Rui/LLAMA-Fac-NEW-A800 (model, ♡ 1)
- 🤗 Linksome/lmf (model)
- 🤗 Mickey25/liangyi_LLaMA_Factory (model)
- 🤗 dongxx1104/llm (model)
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
