Sensi-BERT: Towards Sensitivity Driven Fine-Tuning for Parameter-Efficient BERT
Souvik Kundu, Sharath Nittur Sridhar, Maciej Szankin, Sairam Sundaresan

TL;DR
Sensi-BERT introduces a sensitivity-driven fine-tuning method that efficiently reduces BERT model size for resource-limited devices while maintaining or improving task performance.
Contribution
It proposes a novel sensitivity analysis approach to selectively trim BERT parameters during fine-tuning, enhancing parameter efficiency without heavy additional compute.
Findings
Outperforms existing methods on multiple NLP tasks
Achieves higher accuracy with fewer parameters
Maintains performance with significant model size reduction
Abstract
Large pre-trained language models have recently gained significant traction due to their improved performance on various downstream tasks like text classification and question answering, requiring only a few epochs of fine-tuning. However, their large model sizes often prohibit their application on resource-constrained edge devices. Existing solutions for yielding parameter-efficient BERT models largely rely on compute-exhaustive training and fine-tuning. Moreover, they often rely on additional compute-heavy models to mitigate the performance gap. In this paper, we present Sensi-BERT, a sensitivity-driven efficient fine-tuning of BERT models that can take an off-the-shelf pre-trained BERT model and yield highly parameter-efficient models for downstream tasks. In particular, we perform sensitivity analysis to rank each individual parameter tensor, which is then used to trim them…
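The rank-then-trim idea in the abstract can be illustrated with a minimal sketch. The abstract excerpt is truncated and does not define the sensitivity metric or the trimming rule, so everything below is an assumption made for illustration: the mean-absolute-weight score, the function names (`tensor_sensitivity`, `rank_tensors`, `trim_tensor`, `trim_model`), the two keep ratios, and the magnitude-based zeroing are hypothetical stand-ins, not the authors' method.

```python
from typing import Dict, List


def tensor_sensitivity(weights: List[float]) -> float:
    """Hypothetical sensitivity proxy: mean absolute weight.

    Assumption: the source does not state its metric; this is a stand-in.
    """
    return sum(abs(w) for w in weights) / len(weights)


def rank_tensors(model: Dict[str, List[float]]) -> List[str]:
    """Return tensor names ordered from most to least sensitive."""
    return sorted(model, key=lambda name: tensor_sensitivity(model[name]), reverse=True)


def trim_tensor(weights: List[float], keep_ratio: float) -> List[float]:
    """Zero out all but the largest-magnitude entries (at least one is kept)."""
    keep = max(1, round(len(weights) * keep_ratio))
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(by_magnitude[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]


def trim_model(
    model: Dict[str, List[float]],
    high_ratio: float = 0.75,  # assumed keep ratio for the more sensitive half
    low_ratio: float = 0.25,   # assumed keep ratio for the less sensitive half
) -> Dict[str, List[float]]:
    """Trim less sensitive tensors more aggressively than sensitive ones."""
    ranking = rank_tensors(model)
    half = (len(ranking) + 1) // 2
    trimmed = {}
    for position, name in enumerate(ranking):
        ratio = high_ratio if position < half else low_ratio
        trimmed[name] = trim_tensor(model[name], ratio)
    return trimmed


# Toy "model" with two flattened parameter tensors (made-up values).
toy_model = {
    "attention": [1.0, -2.0, 3.0, -4.0],
    "ffn": [0.1, 0.2, -0.3, 0.4],
}
ranking = rank_tensors(toy_model)
trimmed = trim_model(toy_model)
```

In this toy setup the tensor with the larger average magnitude is ranked first and keeps more of its entries. A real pipeline would operate on framework tensors and would fine-tune the trimmed model on the downstream task afterwards.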
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Multi-Head Attention · Linear Warmup With Linear Decay · Linear Layer · Softmax · Dense Connections · Weight Decay · Dropout · Attention Is All You Need · WordPiece
