Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation
Branislav Pecher, Jan Cegin, Robert Belanec, Jakub Simko, Ivan Srba, Maria Bielikova

TL;DR
This paper introduces DENI, a novel, computationally efficient mitigation strategy for fine-tuning language models that reduces performance instability caused by randomness, outperforming existing methods and benefiting from data augmentation.
Contribution
The paper proposes DENI, a new mitigation approach combining ensembling, noise regularisation, and interpolation, which is more efficient and effective than existing strategies.
Findings
DENI outperforms the best existing mitigation strategy (Ensemble) at a fraction of its computational cost.
Mitigation strategies, including DENI, improve parameter-efficient fine-tuning performance.
Combining DENI with data augmentation enhances instability mitigation.
Abstract
While fine-tuning of pre-trained language models generally helps to overcome the lack of labelled training samples, it also displays model performance instability. This instability mainly originates from randomness in initialisation or data shuffling. To address this, researchers either modify the training process or augment the available samples, which typically results in increased computational costs. We propose a new mitigation strategy, called Delayed Ensemble with Noisy Interpolation (DENI), that leverages the strengths of ensembling, noise regularisation and model interpolation, while retaining computational efficiency. We compare DENI with 9 representative mitigation strategies across 3 models, 4 tuning strategies and 7 text classification datasets. We show that: 1) DENI outperforms the best performing mitigation strategy (Ensemble), while using only a fraction of its cost; 2)…
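To make the three ingredients named in the abstract (ensembling, noise regularisation, model interpolation) concrete, here is a toy sketch on a linear model. It is not the authors' implementation: the abstract does not give the algorithm, so the schedule (one shared training phase, then noisy copies, then uniform weight averaging) and every name and parameter here (`perturb`, `interpolate`, `delay_frac`, `noise_std`, and so on) are assumptions chosen for illustration only.

```python
import numpy as np


def perturb(weights, n_copies, noise_std, rng):
    """Create n_copies of a weight vector, each with Gaussian noise added."""
    return [weights + rng.normal(0.0, noise_std, size=weights.shape)
            for _ in range(n_copies)]


def interpolate(models):
    """Merge several weight vectors into one by uniform averaging."""
    return np.mean(np.stack(models), axis=0)


def gd_steps(weights, X, y, lr, steps):
    """Plain gradient descent on mean squared error for a linear model."""
    w = weights.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w


def delayed_noisy_ensemble(X, y, total_steps=200, delay_frac=0.9,
                           n_copies=5, noise_std=0.05, lr=0.05, seed=0):
    """Hypothetical schedule: train one model, branch late, then merge."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])

    # Phase 1 (assumed): a single model is trained for most of the budget.
    shared_steps = int(total_steps * delay_frac)
    w = gd_steps(w, X, y, lr, shared_steps)

    # Phase 2 (assumed): branch into noisy copies and train each briefly.
    copies = perturb(w, n_copies, noise_std, rng)
    remaining = total_steps - shared_steps
    copies = [gd_steps(c, X, y, lr, remaining) for c in copies]

    # Phase 3 (assumed): interpolate the copies back into a single model.
    return interpolate(copies)
```

In this sketch the shared phase runs once, so the total cost is `shared_steps + n_copies * remaining` gradient steps rather than `n_copies * total_steps` for a fully independent ensemble. That is one plausible reading of how a delayed ensemble could cost only a fraction of a full one.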
Taxonomy
Topics: Industrial Vision Systems and Defect Detection · Speech and Audio Processing · Music and Audio Processing
