Understanding Data Influence with Differential Approximation
Haoru Tan, Sitong Wu, Xiuzhe Wu, Wang Wang, Bo Zhao, Zeke Xie, Gui-Song Xia, and Xiaojuan Qi

TL;DR
This paper introduces Diff-In, a second-order influence approximation method that assesses the influence of individual samples during neural network training without assuming a convex loss, and that improves data cleaning, data deletion, and coreset selection.
Contribution
We propose Diff-In, a novel second-order influence approximation method that is accurate, scalable, and does not rely on convexity assumptions, outperforming existing estimators.
Findings
Diff-In achieves lower approximation error than existing methods.
Diff-In scales to millions of data points efficiently.
Diff-In outperforms baselines in data cleaning, deletion, and coreset selection.
Abstract
Data plays a pivotal role in the groundbreaking advancements in artificial intelligence. The quantitative analysis of data significantly contributes to model training, enhancing both the efficiency and quality of data utilization. However, existing data analysis tools often lag in accuracy. For instance, many of these tools even assume that the loss function of neural networks is convex. These limitations make it challenging to implement current methods effectively. In this paper, we introduce a new formulation to approximate a sample's influence by accumulating the differences in influence between consecutive learning steps, which we term Diff-In. Specifically, we formulate the sample-wise influence as the cumulative sum of its changes/differences across successive training iterations. By employing second-order approximations, we approximate these difference terms with high accuracy…
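The cumulative-difference formulation described in the abstract can be illustrated with a small toy example. The sketch below is not the authors' implementation: it assumes a least-squares model trained with full-batch gradient descent, takes a sample's influence at step t to be the leave-one-out parameter change, and obtains each difference term exactly by retraining. According to the abstract, Diff-In instead approximates those difference terms with second-order approximations, which this sketch does not attempt. All names here (`gd_trajectory`, `influence_by_differences`) are hypothetical.

```python
import numpy as np

def gd_trajectory(X, y, lr=0.1, steps=50, skip=None):
    """Full-batch gradient descent on least squares.

    Returns the parameters after every step, shape (steps + 1, d).
    If `skip` is given, that sample is left out of training.
    """
    mask = np.ones(len(X), dtype=bool)
    if skip is not None:
        mask[skip] = False
    Xs, ys = X[mask], y[mask]
    theta = np.zeros(X.shape[1])
    traj = [theta.copy()]
    for _ in range(steps):
        grad = Xs.T @ (Xs @ theta - ys) / len(Xs)
        theta = theta - lr * grad
        traj.append(theta.copy())
    return np.array(traj)

def influence_by_differences(X, y, idx, **kw):
    """Toy influence of sample `idx`: leave-one-out parameter change per step.

    Assumption for illustration only: the differences are computed exactly
    by retraining, not by a second-order approximation.
    """
    full = gd_trajectory(X, y, **kw)
    loo = gd_trajectory(X, y, skip=idx, **kw)
    influence = loo - full               # I_t for t = 0..T
    diffs = np.diff(influence, axis=0)   # I_{t+1} - I_t for t = 0..T-1
    return influence, diffs

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=20)

influence, diffs = influence_by_differences(X, y, idx=0)

# Accumulating the per-step differences recovers the final influence.
accumulated = influence[0] + diffs.sum(axis=0)
```

Because both runs start from the same initialization, the influence at step 0 is zero, so the final influence equals the sum of the per-step differences. The toy only demonstrates that identity; the practical value of the formulation lies in estimating each difference term cheaply rather than by retraining.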
Taxonomy
Topics: Data Analysis with R · Statistics Education and Methodologies
