Every Model Learned by Gradient Descent Is Approximately a Kernel Machine

Pedro Domingos

arXiv:2012.00152·cs.LG·December 2, 2020·49 cites

TL;DR

This paper shows that deep networks trained by standard gradient descent are approximately equivalent to kernel machines. This makes their weights interpretable as a superposition of the training examples and suggests directions for better learning algorithms.

Contribution

The paper shows that deep networks learned by gradient descent are approximately equivalent to kernel machines, linking deep learning to classical kernel methods and making network weights easier to interpret.

Findings

01 Deep networks trained by gradient descent are approximately kernel machines.

02 Network weights can be viewed as a superposition of the training examples.

03 The network architecture incorporates knowledge of the target function into the kernel.

Abstract

Deep learning's successes are often attributed to its ability to automatically discover new representations of the data, rather than relying on handcrafted features like other learning methods. We show, however, that deep networks learned by the standard gradient descent algorithm are in fact mathematically approximately equivalent to kernel machines, a learning method that simply memorizes the data and uses it directly for prediction via a similarity function (the kernel). This greatly enhances the interpretability of deep network weights, by elucidating that they are effectively a superposition of the training examples. The network architecture incorporates knowledge of the target function into the kernel. This improved understanding should lead to better learning algorithms.
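The claim in the abstract can be checked numerically on a toy problem. The sketch below is illustrative only and is not the paper's code: the tiny tanh model, the data, the learning rate, and all function names are assumptions chosen for this example. It trains the model by full-batch gradient descent on squared loss and, along the way, accumulates each training example's contribution to the prediction at a query point: its loss derivative times the tangent kernel (the dot product of the model's gradients at the query point and at that example). Summing these contributions over the training path and adding the initial prediction should approximately reproduce the trained model's output. With a finite step size the match is only approximate; it becomes exact in the limit of infinitesimally small steps.

```python
import math

def predict(w, x):
    # Hypothetical toy model: f(x) = w2 * tanh(w0 * x + w1) + w3
    return w[2] * math.tanh(w[0] * x + w[1]) + w[3]

def grad(w, x):
    # Gradient of f with respect to the four weights, at input x
    h = math.tanh(w[0] * x + w[1])
    s = w[2] * (1.0 - h * h)
    return [s * x, s, h, 1.0]

def tangent_kernel(w, x_a, x_b):
    # Similarity of two inputs as seen by the model at weights w
    return sum(a * b for a, b in zip(grad(w, x_a), grad(w, x_b)))

def train_and_reconstruct(xs, ys, x_query, w_init, lr=1e-3, steps=2000):
    w = list(w_init)
    # contrib[i] accumulates lr * L'_i * K(x_query, x_i) along the path
    contrib = [0.0] * len(xs)
    for _ in range(steps):
        # Squared loss 0.5 * (f - y)^2 has derivative (f - y)
        residuals = [predict(w, x) - y for x, y in zip(xs, ys)]
        for i, (x, r) in enumerate(zip(xs, residuals)):
            contrib[i] += lr * r * tangent_kernel(w, x_query, x)
        # Full-batch gradient descent step
        g = [0.0] * len(w)
        for x, r in zip(xs, residuals):
            gx = grad(w, x)
            for j in range(len(w)):
                g[j] += r * gx[j]
        w = [wj - lr * gj for wj, gj in zip(w, g)]
    direct = predict(w, x_query)
    # Kernel-machine form: initial prediction plus per-example kernel terms
    via_kernel = predict(w_init, x_query) - sum(contrib)
    return direct, via_kernel

# Illustrative data and starting weights (assumptions for this sketch)
xs = [-1.0, 0.0, 1.0, 2.0]
ys = [-0.8, 0.0, 0.8, 0.9]
w_init = [0.5, 0.0, 0.5, 0.0]
x_query = 0.5

direct, via_kernel = train_and_reconstruct(xs, ys, x_query, w_init)
print("trained model prediction:", direct)
print("kernel-sum reconstruction:", via_kernel)
```

The two printed numbers should agree closely, and the per-example terms in `contrib` show how much each training point moved the prediction, which is the sense in which the learned model "uses the data directly" through a similarity function.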

Videos

Deep Networks Are Kernel Machines (Paper Explained) · youtube

The Professor Who Fought Back Against Cancel Culture in AI - Pedro Domingos · youtube

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis

Methods: Interpretability