FinLoRA: Finetuning Quantized Financial Large Language Models Using Low-Rank Adaptation
Dannong Wang, Daniel Kim, Bo Jin, Xingjian Zhao, Tianfan Fu, Steve Yang, and Xiao-Yang Liu

TL;DR
FinLoRA introduces a method combining quantization and low-rank adaptation to efficiently finetune financial large language models locally, reducing resource usage while maintaining high performance.
Contribution
The paper presents a novel approach using quantized low-rank adaptation (QLoRA) for scalable, resource-efficient finetuning of financial LLMs with minimal performance loss.
Findings
Significant reduction in GPU memory usage.
Improved finetuning speed and efficiency.
Maintained high accuracy on financial tasks.
Abstract
Finetuned large language models (LLMs) have shown remarkable performance in financial tasks, such as sentiment analysis and information retrieval. Due to privacy concerns, finetuning and deploying Financial LLMs (FinLLMs) locally is crucial for institutions. However, finetuning FinLLMs poses challenges, including GPU memory constraints and long input sequences. In this paper, we employ quantized low-rank adaptation (QLoRA) to finetune FinLLMs, which leverages low-rank matrix decomposition and quantization techniques to significantly reduce computational requirements while maintaining high model performance. We also employ data and pipeline parallelism to enable local finetuning using cost-effective, widely accessible GPUs. Experiments on financial datasets demonstrate that our method achieves substantial improvements in accuracy, GPU memory usage, and time efficiency, underscoring the…
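To make the memory argument in the abstract concrete, the following is a minimal back-of-the-envelope sketch, not the authors' implementation. It assumes the standard low-rank adaptation formulation, in which a frozen weight matrix of shape d_out x d_in is augmented by a trainable update B·A with B of shape d_out x r and A of shape r x d_in. The layer size (4096 x 4096) and rank (8) are hypothetical values chosen for illustration and do not come from the paper; the byte counts ignore the small overhead of quantization constants.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameter count of the low-rank update B @ A.

    B has shape (d_out, rank) and A has shape (rank, d_in).
    """
    return rank * (d_out + d_in)


def weight_bytes(num_params: int, bits_per_param: int) -> int:
    """Storage for the weights alone, ignoring quantization metadata."""
    return num_params * bits_per_param // 8


# Hypothetical layer size and rank (assumptions, not taken from the paper).
d_out, d_in, rank = 4096, 4096, 8

full_params = d_out * d_in                             # frozen base weights
lora_params = lora_trainable_params(d_out, d_in, rank) # trainable adapter weights
trainable_fraction = lora_params / full_params

fp16_bytes = weight_bytes(full_params, 16)  # base weights kept in 16-bit
int4_bytes = weight_bytes(full_params, 4)   # base weights quantized to 4-bit

print(f"full parameters:      {full_params}")
print(f"LoRA parameters:      {lora_params}")
print(f"trainable fraction:   {trainable_fraction:.4%}")
print(f"16-bit weight bytes:  {fp16_bytes}")
print(f"4-bit weight bytes:   {int4_bytes}")
```

Under these assumptions, the adapter trains well under 1% of the layer's parameters, and storing the frozen weights in 4 bits takes a quarter of the space of 16-bit storage. Actual savings depend on the model, the rank, and which layers receive adapters.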
Taxonomy
Topics: Stock Market Forecasting Methods
