Efficient Analysis of the Distilled Neural Tangent Kernel
Jamie Mahowald, Brian Bell, Alex Ho, Michael Geyer

TL;DR
This paper introduces the distilled neural tangent kernel (DNTK), a method that significantly reduces NTK computation costs by combining dataset distillation with projection techniques, maintaining performance and kernel structure.
Contribution
The paper presents DNTK, a novel approach that compresses the data dimension for NTK computation, reducing computational complexity by up to five orders of magnitude while preserving accuracy.
Findings
Achieves 20-100× reduction in Jacobian calculations.
Per-class NTK matrices have low effective rank.
Reduces NTK computational complexity by up to five orders of magnitude.
Abstract
Neural tangent kernel (NTK) methods are computationally limited by the need to evaluate large Jacobians across many data points. Existing approaches reduce this cost primarily through projecting and sketching the Jacobian. We show that NTK computation can also be reduced by compressing the data dimension itself using NTK-tuned dataset distillation. We demonstrate that the neural tangent space spanned by the input data can be induced by dataset distillation, yielding a 20-100× reduction in required Jacobian calculations. We further show that per-class NTK matrices have low effective rank that is preserved by this reduction. Building on these insights, we propose the distilled neural tangent kernel (DNTK), which combines NTK-tuned dataset distillation with state-of-the-art projection methods to reduce NTK computational complexity by up to five orders of magnitude while…
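To make the cost in the abstract concrete, here is a minimal NumPy sketch of an empirical NTK built from per-example Jacobians, together with one common notion of effective rank. It is an illustration under stated assumptions, not the paper's implementation: the toy network f(x) = v · tanh(Wx), the helper names `jacobian`, `empirical_ntk`, and `effective_rank`, and the entropy-based effective rank definition are all assumed here, and the paper may define or compute these quantities differently.

```python
import numpy as np

def jacobian(params, X):
    """Per-example gradient of f(x) = v . tanh(W x) w.r.t. (W, v); one row per input."""
    W, v = params
    H = np.tanh(X @ W.T)                                   # hidden activations, (n, h)
    dW = ((1.0 - H**2) * v)[:, :, None] * X[:, None, :]    # df/dW, (n, h, d)
    dv = H                                                 # df/dv, (n, h)
    return np.concatenate([dW.reshape(len(X), -1), dv], axis=1)

def empirical_ntk(params, X1, X2):
    """Empirical NTK: inner products of parameter gradients, K = J(X1) J(X2)^T."""
    return jacobian(params, X1) @ jacobian(params, X2).T

def effective_rank(K):
    """Assumed definition: exponential of the entropy of the normalized spectrum."""
    eig = np.clip(np.linalg.eigvalsh(K), 0.0, None)
    p = eig / eig.sum()
    p = p[p > 0]
    return float(np.exp(-(p * np.log(p)).sum()))

# Toy setup (hypothetical sizes): n inputs of dimension d, hidden width h.
rng = np.random.default_rng(0)
d, h, n = 5, 16, 40
params = (rng.normal(size=(h, d)) / np.sqrt(d), rng.normal(size=h) / np.sqrt(h))
X = rng.normal(size=(n, d))

K = empirical_ntk(params, X, X)   # requires one Jacobian row per data point
r_eff = effective_rank(K)         # always between 1 and n
```

The kernel needs one Jacobian row per input, so replacing the n original points with a smaller set of m points cuts the Jacobian work by a factor of n/m. That data-side factor is what the abstract's 20-100× figure refers to; how the smaller set is constructed (NTK-tuned distillation) is not shown in this sketch.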
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks · Neural Networks and Applications
