Scalable Neural Network Kernels

Arijit Sehanobish; Krzysztof Choromanski; Yunfan Zhao; Avinava Dubey; Valerii Likhosherstov

arXiv:2310.13225 · cs.LG · March 7, 2024 · 2 cites

TL;DR

This paper introduces scalable neural network kernels (SNNKs) that replace traditional layers for improved efficiency and expressiveness, enabling network compression and potential bypassing of backpropagation with theoretical and empirical validation.

Contribution

The paper proposes SNNKs as a novel layer replacement that enhances expressiveness and computational efficiency, along with a bundling process for neural network compression and explicit parameter formulas.
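As a rough illustration of why explicit parameter formulas can exist at all, consider a model that is linear in its trainable parameters on top of a fixed feature map. Under mean squared error, the optimum is then given by the standard regularized least-squares solution, with no gradient descent required. The sketch below shows only that general fact; it is not the paper's formula, and the cosine feature map, the function name `fit_closed_form`, and the `ridge` parameter are assumptions made for the example.

```python
import numpy as np


def fit_closed_form(features, targets, ridge=1e-6):
    """Return theta minimizing ||features @ theta - targets||^2 + ridge * ||theta||^2.

    This is ordinary ridge regression, used here as a stand-in for
    "explicit formulae" under an MSE loss (an assumption, not the paper's method).
    """
    k = features.shape[1]
    gram = features.T @ features + ridge * np.eye(k)
    return np.linalg.solve(gram, features.T @ targets)


rng = np.random.default_rng(0)
n, d, k = 200, 5, 64

X = rng.standard_normal((n, d))
proj = rng.standard_normal((d, k))   # fixed, untrained projection (hypothetical)
features = np.cos(X @ proj)          # hypothetical fixed feature map

# Synthetic targets that are exactly linear in the features.
theta_true = rng.standard_normal(k)
y = features @ theta_true

theta = fit_closed_form(features, y)
mse = float(np.mean((features @ theta - y) ** 2))
print("training MSE:", mse)
```

The only trainable quantity here is `theta`, and it is obtained with a single linear solve, which is the sense in which backpropagation is bypassed in this toy setting.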

Findings

01 Up to 5x reduction in trainable parameters

02 Competitive accuracy with compressed models

03 Theoretical analysis of SNNKs and URFs

Abstract

We introduce the concept of scalable neural network kernels (SNNKs), replacements for regular feedforward layers (FFLs) that are capable of approximating the latter, but with favorable computational properties. SNNKs effectively disentangle the inputs from the parameters of the neural network in the FFL, only to connect them in the final computation via the dot-product kernel. They are also strictly more expressive, as they can model complicated relationships beyond the functions of the dot-products of parameter-input vectors. We also introduce the neural network bundling process that applies SNNKs to compactify deep neural network architectures, resulting in additional compression gains. In its extreme version, it leads to the fully bundled network whose optimal parameters can be expressed via explicit formulae for several loss functions (e.g. mean squared error), opening a possibility…
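The disentangling idea in the abstract can be sketched with a well-known special case. For the exponential activation, exp(w·x) equals the expectation over Gaussian g of exp(g·x - |x|²/2) · exp(g·w - |w|²/2), so one feature map can be applied to the input and to the parameters separately, with the two meeting only in a final dot product. This is a minimal sketch under that assumption; it uses standard positive random features rather than the paper's universal random features, and the name `random_feature_map` and all sizes are hypothetical.

```python
import numpy as np


def random_feature_map(v, G):
    """Map a vector v of shape (d,) to m random features.

    For rows of G drawn i.i.d. from N(0, I), the dot product
    random_feature_map(x, G) @ random_feature_map(w, G)
    is an unbiased Monte Carlo estimate of exp(w @ x).
    """
    m = G.shape[0]
    return np.exp(G @ v - 0.5 * (v @ v)) / np.sqrt(m)


rng = np.random.default_rng(0)
d, m = 8, 20000
G = rng.standard_normal((m, d))      # shared Gaussian projections

# Input and parameter vectors, rescaled to norm 0.5 to keep the variance low.
x = rng.standard_normal(d)
x *= 0.5 / np.linalg.norm(x)
w = rng.standard_normal(d)
w *= 0.5 / np.linalg.norm(w)

exact = float(np.exp(w @ x))                       # entangled: f(w . x)
phi_x = random_feature_map(x, G)                   # depends on the input only
psi_w = random_feature_map(w, G)                   # depends on the parameters only
approx = float(phi_x @ psi_w)                      # joined by a dot product

print("exact:", exact, "approx:", approx)
```

Because `psi_w` depends only on the parameters, it can be precomputed or learned directly, which is the property that the compression and bundling claims build on.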

Peer Reviews

Decision · ICLR 2024 poster

Reviewer 01 · Rating 8 (accept, good paper) · Confidence 3

Strengths

The paper introduces the concept of Scalable Neural Network Kernels (SNNKs), a fresh take on neural network architecture. This novel approach to approximating regular feedforward layers (FFLs) with computational benefits showcases a high degree of originality. The "neural network bundling process" and the notion of a fully bundled network present innovative methods for condensing deep neural network architectures. The "universal random features" mechanism, which aids in the formulation of variou

Weaknesses

The paper could benefit from a more direct comparison of SNNKs with other existing solutions or methods aimed at network compression or efficiency. Highlighting the unique advantages of SNNKs over these methods would further solidify its significance. The paper could delve deeper into the robustness of the SNNK approach. Are there scenarios where the approximation might break down? Understanding the edge cases and potential pitfalls would be crucial for practitioners looking to adopt this metho

Reviewer 02 · Rating 8 (accept, good paper) · Confidence 3

Strengths

The paper introduces a new computational model, the scalable neural network kernels (SNNK), providing a novel approach to efficient neural network design, particularly for replacing feedforward layers in MLPs. The design of SNNKs ensures that inputs and parameters are disentangled, leading to efficient final computations via a dot-product kernel, which can greatly reduce computational overhead. The bundling process highlighted in the paper leads to the compactification of the neural network st

Weaknesses

The authors should provide some explanation or intuition for why their model does not work well in some of the experiments they performed. How deep a feedforward network can be approximated with the proposed method should be analyzed in further detail. Can scalable neural network kernels be applied in any scenario, or are there specific scenarios where SNNKs won't work well? The authors should discuss such datasets/models. If there is none, then authors should als

Reviewer 03 · Rating 5 (marginally below the acceptance threshold) · Confidence 3

Strengths

Here are some of the main strengths of this paper:
- It makes an insightful connection between scalable kernel methods and neural network layers, introducing a novel perspective on feedforward layers.
- The concept of SNNKs is very clearly presented along with detailed theoretical analysis and constructions.
- The Fourier transform based universal random feature mechanism to instantiate SNNKs is interesting and useful.
- SNNKs provably increase expressive power over standard layers, as shown

Weaknesses

Some potential weaknesses or limitations of this paper:
- The focus is on feedforward fully-connected layers, not convolutional or recurrent layers commonly used in modern networks.
- Experiments are limited to standard datasets and models; more complex domains like bioinformatics are not evaluated.
- There is no investigation into how SNNKs affect representation learning or generalization. The emphasis is on compression.
- Optimization and learning dynamics with SNNKs are not analyzed, apa

Code & Models

Repositories

arijitthegame/neural-network-kernels
PyTorch · Official

Videos

Scalable Neural Network Kernels · SlidesLive

Taxonomy

Topics: Advanced Neural Network Applications · Geophysical Methods and Applications · Machine Learning and ELM

Methods: Adapter