Efficient Parametric Approximations of Neural Network Function Space Distance
Nikita Dhawan, Sicong Huang, Juhan Bae, Roger Grosse

TL;DR
This paper introduces LAFTR (Linearized Activation Function TRick), an efficient parametric approximation of the Function Space Distance (FSD) between ReLU neural networks, with applications to continual learning, influence estimation, and mislabeled-data detection.
Contribution
The paper presents LAFTR, a linearized approximation technique for ReLU networks that needs only one parameter per unit of the network yet outperforms parametric approximations with larger memory requirements at estimating function space distance.
Findings
LAFTR outperforms other parametric approximations while using less memory than they do.
In continual learning, it is competitive with state-of-the-art nonparametric approximations; it is also effective for influence estimation.
It can detect mislabeled data without extensive dataset iteration.
Abstract
It is often useful to compactly summarize important properties of model parameters and training data so that they can be used later without storing and/or iterating over the entire dataset. As a specific case, we consider estimating the Function Space Distance (FSD) over a training set, i.e. the average discrepancy between the outputs of two neural networks. We propose a Linearized Activation Function TRick (LAFTR) and derive an efficient approximation to FSD for ReLU neural networks. The key idea is to approximate the architecture as a linear network with stochastic gating. Despite requiring only one parameter per unit of the network, our approach outcompetes other parametric approximations with larger memory requirements. Applied to continual learning, our parametric approximation is competitive with state-of-the-art nonparametric approximations, which require storing many training…
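For orientation, the quantity being approximated can be computed exactly when the training inputs are still available. The sketch below is a minimal illustration, not the paper's method: it evaluates an empirical FSD between two small ReLU networks using half the mean squared Euclidean distance between their outputs. The choice of discrepancy, the helper names (`relu_mlp`, `empirical_fsd`, `init_params`), and the layer sizes are all assumptions made for this example.

```python
import numpy as np


def relu_mlp(params, x):
    """Forward pass of a ReLU MLP; `params` is a list of (W, b) pairs."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)  # ReLU on every hidden layer
    W, b = params[-1]
    return h @ W + b  # linear output layer


def empirical_fsd(params_a, params_b, inputs):
    """Average output discrepancy between two networks over `inputs`.

    Assumption: the discrepancy is half the squared Euclidean distance
    between outputs; other output-space measures could be substituted.
    """
    diff = relu_mlp(params_a, inputs) - relu_mlp(params_b, inputs)
    return 0.5 * np.mean(np.sum(diff ** 2, axis=1))


def init_params(rng, sizes):
    """Hypothetical helper: random Gaussian weights, zero biases."""
    return [
        (rng.normal(scale=1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])
    ]


rng = np.random.default_rng(0)
sizes = [4, 16, 16, 3]                 # input dim 4, two hidden layers, 3 outputs
net_a = init_params(rng, sizes)
net_b = init_params(rng, sizes)
train_x = rng.normal(size=(256, 4))    # stand-in for the stored training inputs

print("FSD(a, a):", empirical_fsd(net_a, net_a, train_x))
print("FSD(a, b):", empirical_fsd(net_a, net_b, train_x))
```

Note that this exact computation needs a full pass over the stored inputs for every pair of networks compared, which is the cost a parametric approximation is meant to avoid.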
Taxonomy
Topics: Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM
