SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models

Yuxuan Zhang

arXiv:2502.18168·cs.CL·March 5, 2025

Open Access

TL;DR

SECURA introduces a novel parameter-efficient fine-tuning method for large language models that reduces catastrophic forgetting and enhances performance through sigmoid-enhanced CUR decomposition and a new normalization technique.

Contribution

The paper proposes SECURA, a new PEFT approach combining Sigmoid-Enhanced CUR decomposition with a novel normalization to mitigate forgetting and improve fine-tuning in LLMs.
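For readers unfamiliar with the CUR decomposition named above, here is a minimal NumPy sketch of the generic technique: a matrix W is approximated as C U R, where C holds actual columns of W, R holds actual rows of W, and U is a small linking matrix. This is not SECURA's procedure. The uniform random row/column selection, the helper name `cur_decompose`, and the pseudo-inverse formula for U are illustrative assumptions, and this summary does not describe how SECURA selects or trains these factors.

```python
import numpy as np

def cur_decompose(W, k, rng):
    """Generic CUR sketch (hypothetical helper, not the paper's code).

    Picks k columns and k rows uniformly at random, then solves for the
    small k x k linking matrix U with pseudo-inverses: U = C^+ W R^+.
    """
    cols = rng.choice(W.shape[1], size=k, replace=False)
    rows = rng.choice(W.shape[0], size=k, replace=False)
    C = W[:, cols]   # actual columns of W, shape (m, k)
    R = W[rows, :]   # actual rows of W, shape (k, n)
    U = np.linalg.pinv(C) @ W @ np.linalg.pinv(R)
    return C, U, R

rng = np.random.default_rng(0)
# Build a matrix of exact rank 3 so that a rank-3 CUR can recover it.
W = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 15))
C, U, R = cur_decompose(W, k=3, rng=rng)

# Relative reconstruction error of the C @ U @ R approximation.
rel_err = np.linalg.norm(W - C @ U @ R) / np.linalg.norm(W)
print(C.shape, U.shape, R.shape, rel_err)
```

Because C and R are taken directly from W, only the small matrix U is new; that is what makes CUR attractive as a compact parameterization.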

Findings

01. Achieves a 3.59% average improvement on multiple-choice tasks.

02. Outperforms existing methods such as DoRA and EWC in knowledge retention.

03. Maintains over 70% accuracy on basic knowledge tests.

Abstract

With the rapid development of large language models (LLMs), full fine-tuning (FT) of these models is becoming increasingly infeasible due to its high computational demands. FT also increases the risk of catastrophic forgetting. As an alternative, Low-Rank Adaptation (LoRA) has been proposed. By fine-tuning only a small subset of parameters, LoRA achieves performance similar to FT while significantly reducing resource requirements. However, because LoRA inherits FT's design, catastrophic forgetting remains an issue. To address these limitations, we propose SECURA: Sigmoid-Enhanced CUR Decomposition LoRA, a novel PEFT variant designed to mitigate catastrophic forgetting while improving fine-tuning performance. Our method introduces a novel normalization technique, Sigmoid-based Magnitude Norm (S-MagNorm), which enhances parameter retention and fine-tuning efficiency. SECURA…
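To make the abstract's point about LoRA's parameter savings concrete, the following NumPy sketch shows a standard LoRA forward pass: the pretrained weight W stays frozen and only two small matrices A and B are trained. The dimensions, the scaling factor `alpha`, and the helper name `lora_forward` are illustrative assumptions rather than values from the paper, and the sketch does not include SECURA's CUR factors or S-MagNorm.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0):
    """Standard LoRA forward pass (illustrative sketch).

    Output is x W^T plus a scaled low-rank correction (alpha / r) * x A^T B^T.
    W is frozen; only A (r x d_in) and B (d_out x r) would be trained.
    """
    r = A.shape[0]
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 32, 4                  # assumed toy dimensions
W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init: update starts at 0
x = rng.standard_normal((8, d_in))          # a batch of 8 inputs

y = lora_forward(x, W, A, B)

full_params = W.size            # parameters updated by full fine-tuning
lora_params = A.size + B.size   # parameters updated by LoRA
print(full_params, lora_params)
```

With B initialized to zero, the adapted layer reproduces the frozen layer exactly at the start of training, and in this toy configuration the trainable parameter count drops from 2048 to 384.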

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis

Methods: Experience Replay · Elastic Weight Consolidation