LayerIF: Estimating Layer Quality for Large Language Models using Influence Functions
Hadi Askari, Shivanshu Gupta, Fei Wang, Anshuman Chhabra, Muhao Chen

TL;DR
LayerIF introduces a data-driven influence function approach to estimate layer-wise training quality in large language models, enabling better task-specific layer importance assessment and improving downstream task performance.
Contribution
The paper presents LayerIF, an influence function-based framework for quantifying layer importance in LLMs. Unlike existing model-centric heuristics, it accounts for both model architecture and training data.
Findings
LayerIF provides accurate layer importance estimates for different tasks.
Using LayerIF improves task performance through better layer allocation.
The method is model-agnostic and effective across multiple LLM architectures.
Abstract
Pretrained Large Language Models (LLMs) achieve strong performance across a wide range of tasks, yet exhibit substantial variability in the various layers' training quality with respect to specific downstream applications, limiting their downstream performance. It is therefore critical to estimate layer-wise training quality in a manner that accounts for both model architecture and training data. However, existing approaches predominantly rely on model-centric heuristics (such as spectral statistics, outlier detection, or uniform allocation) while overlooking the influence of data. To address these limitations, we propose LayerIF, a data-driven framework that leverages Influence Functions to quantify the training quality of individual layers in a principled and task-sensitive manner. By isolating each layer's gradients and measuring the sensitivity of the validation loss to training…
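The idea in the abstract, isolating each layer's gradients and measuring how sensitive the validation loss is to training samples, can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and not taken from the paper: the toy model (two scalar "layers", pred = w2 * tanh(w1 * x)), the squared loss, the damped empirical-Fisher stand-in for each layer's Hessian block, and the function names `layer_grads` and `layerwise_influence`. The influence expression itself is the standard one, I = -grad_val * H^-1 * grad_train, restricted to one layer's parameters at a time.

```python
import math

def forward(w1, w2, x):
    """Toy two-layer model (assumption): hidden = tanh(w1*x), pred = w2*hidden."""
    h = math.tanh(w1 * x)
    return h, w2 * h

def layer_grads(w1, w2, x, y):
    """Gradient of the squared loss 0.5*(pred - y)^2 w.r.t. each layer's weight."""
    h, pred = forward(w1, w2, x)
    err = pred - y
    g1 = err * w2 * (1.0 - h * h) * x  # d loss / d w1 (layer 1)
    g2 = err * h                       # d loss / d w2 (layer 2)
    return [g1, g2]

def layerwise_influence(w1, w2, train, val, damping=1e-3):
    """Per-layer influence of each training sample on the mean validation loss.

    Uses a damped empirical-Fisher approximation of each layer's Hessian
    block (an assumption, chosen to keep the sketch short).
    Returns scores[layer][train_index].
    """
    train_g = [layer_grads(w1, w2, x, y) for x, y in train]
    val_g = [layer_grads(w1, w2, x, y) for x, y in val]
    scores = []
    for layer in range(2):
        # Scalar curvature estimate for this layer: mean squared gradient + damping.
        curv = sum(g[layer] ** 2 for g in train_g) / len(train_g) + damping
        # Mean validation gradient restricted to this layer.
        v = sum(g[layer] for g in val_g) / len(val_g)
        # Influence of upweighting each training sample on validation loss.
        scores.append([-v * g[layer] / curv for g in train_g])
    return scores

if __name__ == "__main__":
    train = [(0.5, 0.2), (1.0, 0.7), (-0.3, -0.1)]  # made-up (x, y) pairs
    val = [(0.8, 0.5)]
    for i, row in enumerate(layerwise_influence(0.4, 1.2, train, val)):
        print(f"layer {i + 1} influences:", row)
```

Under this sign convention, a negative value means upweighting that training sample would lower the validation loss. How the per-sample values are aggregated into a single layer-quality score is left open here, since the truncated abstract does not specify it.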
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Natural Language Processing Techniques
