Neural Networks for Learnable and Scalable Influence Estimation of Instruction Fine-Tuning Data
Ishika Agarwal, Dilek Hakkani-Tür

TL;DR
This paper introduces NN-CIFT, a neural network-based method that significantly reduces the computational cost of influence estimation in language models, enabling scalable and effective subset selection for instruction fine-tuning.
Contribution
We propose a small neural network approach, NN-CIFT, that estimates influence values with up to 99% cost reduction, maintaining accuracy across large models and datasets.
Findings
Achieves up to 99% reduction in influence estimation cost.
Maintains performance comparable to traditional influence functions.
Estimates influence values with networks just 0.0027% the size of the full language models (7B and 8B versions).
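To make the 0.0027% figure concrete, the short calculation below converts it into an approximate parameter count. It assumes that "size" means parameter count and that the 7B and 8B models have exactly 7 and 8 billion parameters; neither assumption is stated in the text above, and the function name is purely illustrative.

```python
# Assumption: "size" = parameter count; 7B and 8B taken as exactly 7e9 and 8e9.
def influence_network_params(llm_params: int) -> int:
    # 0.0027% is 27 parts per million; integer arithmetic avoids float rounding.
    return llm_params * 27 // 1_000_000

for name, n in [("7B", 7_000_000_000), ("8B", 8_000_000_000)]:
    print(f"{name}: about {influence_network_params(n):,} parameters")
# prints:
# 7B: about 189,000 parameters
# 8B: about 216,000 parameters
```

Under these assumptions, the estimator would have on the order of a few hundred thousand parameters.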
Abstract
Influence functions provide crucial insights into model training, but existing methods suffer from large computational costs and limited generalization. Particularly, recent works have proposed various metrics and algorithms to calculate the influence of data using language models, which do not scale well with large models and datasets. This is because of the expensive forward and backward passes required for computation, substantial memory requirements to store large models, and poor generalization of influence estimates to new data. In this paper, we explore the use of small neural networks -- which we refer to as the InfluenceNetwork -- to estimate influence values, achieving up to 99% cost reduction. Our evaluation demonstrates that influence values can be estimated with models just 0.0027% the size of full language models (we use 7B and 8B versions). We apply our algorithm of…
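The idea of replacing an expensive influence computation with a small learned estimator can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the architecture, inputs, or training setup, so the pair-of-embeddings input, the layer sizes, the 20% labeled fraction, and the `expensive_influence` stand-in (a scaled dot product on synthetic vectors) are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, HIDDEN = 8, 32  # hypothetical sizes, not taken from the paper

def expensive_influence(a, b):
    # Stand-in for a costly LLM-based influence score (assumption: scaled dot product).
    return (a * b).sum(axis=1) / a.shape[1]

def init_params(in_dim, hidden):
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(in_dim), (in_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(p, x):
    h = np.maximum(0.0, x @ p["W1"] + p["b1"])  # ReLU hidden layer
    return (h @ p["W2"] + p["b2"]).ravel(), h

def train(p, x, y, lr=0.01, steps=1000):
    """Full-batch gradient descent on mean squared error; returns the loss history."""
    history = []
    for _ in range(steps):
        pred, h = forward(p, x)
        history.append(float(np.mean((pred - y) ** 2)))
        d_pred = (2.0 / len(y)) * (pred - y)[:, None]  # dMSE/dpred
        d_h = (d_pred @ p["W2"].T) * (h > 0)           # backprop through ReLU
        p["W2"] -= lr * (h.T @ d_pred)
        p["b2"] -= lr * d_pred.sum(axis=0)
        p["W1"] -= lr * (x.T @ d_h)
        p["b1"] -= lr * d_h.sum(axis=0)
    return history

# Synthetic embedding pairs; only the first 20% receive "expensive" labels.
a, b = rng.normal(size=(500, DIM)), rng.normal(size=(500, DIM))
pairs = np.concatenate([a, b], axis=1)
n_labeled = 100
params = init_params(2 * DIM, HIDDEN)
history = train(params, pairs[:n_labeled],
                expensive_influence(a[:n_labeled], b[:n_labeled]))
pred_rest, _ = forward(params, pairs[n_labeled:])  # cheap estimates for remaining pairs
n_params = sum(v.size for v in params.values())
print(f"train MSE {history[0]:.4f} -> {history[-1]:.4f}; network has {n_params} parameters")
```

In this sketch, the costly scorer is called only on the labeled pairs, and every other pair is scored with a single forward pass through a network of a few hundred parameters. How well such estimates generalize depends on the real influence function and data, which this toy setup does not model.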
Taxonomy
Topics: Neural Networks and Applications · Advanced Sensor and Control Systems
