Efficient Learning With Sine-Activated Low-rank Matrices
Yiping Ji, Hemanth Saratchandran, Cameron Gordon, Zeyu Zhang, Simon Lucey

TL;DR
This paper introduces a sine-activated low-rank matrix decomposition method that improves the accuracy of parameter-efficient neural network models without sacrificing their compactness, applicable across various architectures.
Contribution
The authors propose a novel sine-activated low-rank decomposition framework that enhances model performance while maintaining parameter efficiency, serving as a plug-in for existing low-rank models.
Findings
Improved accuracy in Vision Transformers, LLMs, NeRF, and 3D shape modeling.
Preserves parameter efficiency while increasing effective rank.
Demonstrated versatility across multiple neural network architectures.
Abstract
Low-rank decomposition has emerged as a vital tool for enhancing parameter efficiency in neural network architectures, gaining traction across diverse applications in machine learning. These techniques significantly lower the number of parameters, striking a balance between compactness and performance. However, a common challenge has been the compromise between parameter efficiency and the accuracy of the model, where reduced parameters often lead to diminished accuracy compared to their full-rank counterparts. In this work, we propose a novel theoretical framework that integrates a sinusoidal function within the low-rank decomposition process. This approach not only preserves the benefits of the parameter efficiency characteristic of low-rank methods but also increases the decomposition's rank, thereby enhancing model performance. Our method proves to be a plug-in enhancement for…
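As a rough illustration of the idea in the abstract, the sketch below compares a plain low-rank product with a sine-activated one. It assumes the sine is applied elementwise to the product of the two factors, scaled by a frequency; the names U, V, omega and both function names are illustrative and not taken from the paper, and any normalization the authors may apply is left out.

```python
import numpy as np

def low_rank_weight(U, V):
    # Plain low-rank weight: the rank is at most k, the shared inner dimension.
    return U @ V.T

def sine_low_rank_weight(U, V, omega=100.0):
    # Assumed form (not confirmed by the source): elementwise sine of the
    # scaled low-rank product. omega is a hypothetical frequency parameter.
    return np.sin(omega * (U @ V.T))

rng = np.random.default_rng(0)
m, n, k = 32, 32, 2
U = rng.standard_normal((m, k))
V = rng.standard_normal((n, k))

W_lr = low_rank_weight(U, V)
W_sine = sine_low_rank_weight(U, V)

# Numerical rank of each weight matrix.
rank_lr = np.linalg.matrix_rank(W_lr)
rank_sine = np.linalg.matrix_rank(W_sine)

# Both variants store only the two factors, so the parameter count is the same.
n_params_factored = U.size + V.size   # k * (m + n)
n_params_dense = m * n                # a full-rank matrix of the same shape

print("rank of U V^T:", rank_lr)
print("rank of sin(omega * U V^T):", rank_sine)
print("factored parameters:", n_params_factored, "dense parameters:", n_params_dense)
```

In this toy setting the sine-activated matrix is built from the same two factors as the plain product, so the stored parameter count does not change, while its numerical rank is no longer capped at k.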
Peer Reviews
Decision: ICLR 2025 Poster
+ The paper provides theoretical justification for the benefit of the proposed sine-activated low-rank matrices without adding additional parameters.
+ The experimental results show that the proposed sine-activated low-rank decomposition works better than the original LoRA.
- The paper talks about the benefits of using a sine function for increasing rank, but it could explain more about why this function was chosen over others and why it works. A discussion of its advantages would strengthen this work.
- While the proposed method is applied across various domains, the paper lacks a comprehensive comparison with existing low-rank approximation techniques. There are many existing LoRA variants and improved versions, e.g. [a] [b]. [a] LoLDU: Low-Rank Adaptation via
- The idea is simple and fascinating. If there are no flaws in the proof, which I didn't find, this has a potentially large impact in the domain.
- The mathematical background is well explained, motivated and proven.
- Figures like Figure 2 help in understanding the impact of the method's parameter.
- The method is shown on a variety of very different tasks, from classification to NeRF.
- Extreme scenarios are studied, like NeRF reconstruction using LoRA with rank k=1
- The method is only compared against self-implemented baselines and seems to underperform compared to other improvements in the domain. E.g., one of the citations even mentions DoRA, but only as an example of an application of LoRA techniques. They do have results for LLaMA3-8B LoRA which are similar to the results achieved in this paper, but with DoRA they improve the performance on commonsense reasoning beyond this paper. When presenting an improved LoRA training technique the authors should compare their per
Task Motivation: The paper tackles a highly relevant problem: parameter-efficient learning. While overparameterization is effective for generalization, practical deployment in industry necessitates efficient architectures for cost-effectiveness.
Simple Yet Effective Approach: The introduction of a sine function to augment low-rank decompositions is straightforward but impactful. As shown in Figure 3, the Sine LR method maintains representational power comparable to other activation functions whi
Lack of Rationale Behind the Sine Function's Effectiveness: While the theoretical and experimental validations in Sections 3.2 and 4 establish the efficacy of the sine activation, the paper doesn't delve into why the sine function specifically enhances parameter-efficient training. I'm left wondering what inherent properties of the sine function drive these results. A clearer explanation of how the sine function contributes to increasing matrix rank or improving performance without adding parame
Taxonomy
Topics: Neural Networks and Applications · Blind Source Separation Techniques · Sensor Technology and Measurement Systems
