Gradient-Based Automated Iterative Recovery for Parameter-Efficient Tuning
Maximilian Mozes, Tolga Bolukbasi, Ann Yuan, Frederick Liu, Nithum Thain, Lucas Dixon

TL;DR
This paper introduces G-BAIR (gradient-based automated iterative recovery), a method that uses influence techniques such as TracIn to recover the performance of large language models trained with parameter-efficient tuning on corrupted data, with applications to data cleaning and debugging.
Contribution
The paper proposes G-BAIR, a novel gradient-based method that uses influence scores to automatically recover model performance after label corruption in parameter-efficient tuning (PET) settings.
Findings
G-BAIR successfully recovers LLM performance on benchmarks with corrupted labels.
Influence methods can automate data cleaning in transfer learning.
G-BAIR enables interactive debugging and relabeling for PET models.
Abstract
Pretrained large language models (LLMs) are able to solve a wide variety of tasks through transfer learning. Various explainability methods have been developed to investigate their decision-making process. TracIn (Pruthi et al., 2020) is one such gradient-based method, which explains model inferences based on the influence of training examples. In this paper, we explore the use of TracIn to improve model performance in the parameter-efficient tuning (PET) setting. We develop conversational safety classifiers via the prompt-tuning PET method and show how the unique characteristics of the PET regime enable TracIn to identify the cause of certain misclassifications by LLMs. We develop a new methodology for using gradient-based explainability techniques to improve model performance, G-BAIR: gradient-based automated iterative recovery. We show that G-BAIR can recover LLM performance on…
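To make the influence idea concrete, here is a minimal sketch of a checkpoint-based TracIn score: the learning-rate-weighted sum, over saved checkpoints, of the dot product between the loss gradient at a training example and the loss gradient at a test example. It is an illustration under stated assumptions, not the paper's implementation. The model is a toy two-parameter logistic regression rather than a prompt-tuned LLM, and the names loss_grad and tracin_influence, the checkpoint values, and the learning rates are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_grad(w, x, y):
    """Gradient of the logistic loss at one example (x, y) with respect to w."""
    return (sigmoid(x @ w) - y) * x

def tracin_influence(checkpoints, learning_rates, x_train, y_train, x_test, y_test):
    """Sum over checkpoints of lr * <grad(train example), grad(test example)>."""
    score = 0.0
    for w, lr in zip(checkpoints, learning_rates):
        g_train = loss_grad(w, x_train, y_train)
        g_test = loss_grad(w, x_test, y_test)
        score += lr * float(g_train @ g_test)
    return score

# Hypothetical checkpoints of a 2-parameter model (toy values, not from the paper).
checkpoints = [np.array([0.0, 0.0]), np.array([0.5, -0.5])]
learning_rates = [0.1, 0.1]

x_test, y_test = np.array([1.0, 0.0]), 1.0
x_same, y_same = np.array([1.0, 0.0]), 1.0  # training example that agrees with the test example
x_flip, y_flip = np.array([1.0, 0.0]), 0.0  # same input with a corrupted (flipped) label

proponent = tracin_influence(checkpoints, learning_rates, x_same, y_same, x_test, y_test)
opponent = tracin_influence(checkpoints, learning_rates, x_flip, y_flip, x_test, y_test)
print(proponent > 0, opponent < 0)
```

In this toy setup the correctly labeled training example receives a positive score (a proponent of the test prediction) and the flipped-label example receives a negative score (an opponent), which is the kind of signal an influence-based cleaning procedure can rank training examples by. In a PET setting the gradients would presumably be taken only over the small set of tuned parameters, which is an assumption here rather than something this summary spells out.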
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
