KS-Lottery: Finding Certified Lottery Tickets for Multilingual Language Models

Fei Yuan; Chang Ma; Shuai Yuan; Qiushi Sun; Lei Li

arXiv:2402.02801 · cs.CL · June 4, 2024 · 1 citation

Open Access

TL;DR

This paper introduces KS-Lottery, a method that uses the Kolmogorov-Smirnov test to identify a small subset of multilingual language model parameters whose fine-tuning achieves performance comparable to full fine-tuning.

Contribution

KS-Lottery provides a theoretically grounded approach to finding certified winning tickets in the embedding layer of LLMs, enabling efficient fine-tuning with far fewer parameters than full fine-tuning.

Findings

01 KS-Lottery finds smaller parameter sets for fine-tuning than other parameter-efficient tuning algorithms.

02 Fine-tuning the embeddings of 18 tokens suffices for translation tasks.

03 KS-Lottery achieves performance comparable to full fine-tuning.

Abstract

The lottery ticket hypothesis posits the existence of "winning tickets" within a randomly initialized neural network. Do winning tickets exist for LLMs in fine-tuning scenarios? How can we find such winning tickets? In this paper, we propose KS-Lottery, a method to identify a small subset of LLM parameters that is highly effective in multilingual fine-tuning. Our key idea is to use the Kolmogorov-Smirnov test to analyze the distribution shift of parameters before and after fine-tuning. We further prove theoretically that KS-Lottery can find certified winning tickets in the embedding layer: fine-tuning only the found parameters is guaranteed to perform as well as full fine-tuning. In comparisons with other parameter-efficient tuning algorithms on translation tasks, KS-Lottery finds a much smaller set of parameters for fine-tuning while achieving the…
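
The key idea in the abstract, testing each parameter group for a distribution shift between the pre-trained and fine-tuned model, can be illustrated with a small sketch. This is not the authors' implementation. It assumes that the test is applied per token to rows of the embedding matrix, that a token is selected when a two-sample Kolmogorov-Smirnov test rejects at level alpha, and that the standard large-sample critical value is an acceptable threshold. The names ks_statistic, ks_critical_value, and select_tokens, the value of alpha, and the toy data are all hypothetical.

```python
import bisect
import math
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    for x in sa + sb:
        fa = bisect.bisect_right(sa, x) / len(sa)
        fb = bisect.bisect_right(sb, x) / len(sb)
        d = max(d, abs(fa - fb))
    return d


def ks_critical_value(n, m, alpha):
    """Large-sample rejection threshold for the two-sample KS test."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))


def select_tokens(before, after, alpha=0.05):
    """Indices of embedding rows whose values shifted significantly.

    Assumption: one test per token, comparing that token's embedding
    values before and after fine-tuning.
    """
    selected = []
    for idx, (row_before, row_after) in enumerate(zip(before, after)):
        d = ks_statistic(row_before, row_after)
        if d > ks_critical_value(len(row_before), len(row_after), alpha):
            selected.append(idx)
    return selected


# Toy data (hypothetical): 20 tokens with 256-dimensional embeddings.
rng = random.Random(0)
vocab_size, dim = 20, 256
before = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]

# Most rows barely move during "fine-tuning"; rows 3 and 7 shift strongly.
after = []
for idx, row in enumerate(before):
    shift = 1.0 if idx in (3, 7) else 0.0
    after.append([v + shift + rng.gauss(0.0, 0.01) for v in row])

selected = select_tokens(before, after)
print(selected)
```

In a parameter-efficient setup along these lines, only the selected rows would remain trainable while the rest of the model stays frozen. How the threshold relates to the certification guarantee is established in the paper itself, not by this sketch.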

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Sparse Evolutionary Training