Enhanced Recurrent Neural Tangent Kernels for Non-Time-Series Data

Sina Alemohammad; Randall Balestriero; Zichao Wang; Richard Baraniuk

arXiv:2012.04859 · cs.LG · October 22, 2021

TL;DR

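This paper extends the kernels associated with RNNs, previously derived only for simple RNNs, to bidirectional RNNs and RNNs with average pooling. It shows that classifiers built on these kernels outperform a range of baselines on 90 non-time-series UCI datasets, and it provides a fast GPU implementation.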

Contribution

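It derives infinite-width kernels for more complex RNN architectures (bidirectional RNNs and RNNs with average pooling) and shows that classifiers using them outperform a range of baseline methods on non-time-series data.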

Findings

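01. Classifiers using RNN-based kernels outperform a range of baseline methods on 90 non-time-series datasets from the UCI data repository.

02. A fast GPU implementation makes the kernels practical to compute.

03. RNN kernel theory, previously limited to simple RNNs, now covers bidirectional RNNs and RNNs with average pooling.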

Abstract

Kernels derived from deep neural networks (DNNs) in the infinite-width regime provide not only high performance in a range of machine learning tasks but also new theoretical insights into DNN training dynamics and generalization. In this paper, we extend the family of kernels associated with recurrent neural networks (RNNs), which were previously derived only for simple RNNs, to more complex architectures including bidirectional RNNs and RNNs with average pooling. We also develop a fast GPU implementation to exploit the full practical potential of the kernels. Though RNNs are typically only applied to time-series data, we demonstrate that classifiers using RNN-based kernels outperform a range of baseline methods on 90 non-time-series datasets from the UCI data repository.
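
The abstract refers to classifiers using RNN-based kernels. As a rough illustration of how any precomputed kernel plugs into a classifier, here is a minimal sketch. It makes two assumptions that do not come from the paper: a Gaussian RBF kernel stands in for the RNN-based kernel, and a kernel perceptron stands in for whatever classifier the authors use. All function names (`rbf_kernel`, `gram_matrix`, `train_kernel_perceptron`, `predict`) and the toy data are hypothetical.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Stand-in kernel (Gaussian RBF); NOT the RNN-based kernel from the paper."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def gram_matrix(rows, cols, kernel):
    """Kernel values between every point in `rows` and every point in `cols`."""
    return [[kernel(r, c) for c in cols] for r in rows]

def train_kernel_perceptron(gram, labels, epochs=20):
    """Train on a precomputed train-by-train Gram matrix; labels are -1 or +1.

    Returns dual coefficients: alpha[i] counts the mistakes made on example i.
    """
    n = len(labels)
    alpha = [0] * n
    for _ in range(epochs):
        mistakes = 0
        for i in range(n):
            score = sum(alpha[j] * labels[j] * gram[i][j] for j in range(n))
            if labels[i] * score <= 0:  # misclassified (or undecided)
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training point classified correctly
            break
    return alpha

def predict(gram_test_train, alpha, labels):
    """Classify test points from a precomputed test-by-train kernel matrix."""
    preds = []
    for row in gram_test_train:
        score = sum(a * y * k for a, y, k in zip(alpha, labels, row))
        preds.append(1 if score > 0 else -1)
    return preds

# Toy tabular (non-time-series) data: two well-separated clusters.
train_x = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
train_y = [-1, -1, 1, 1]
test_x = [[0.05, 0.1], [1.05, 0.95]]

alpha = train_kernel_perceptron(gram_matrix(train_x, train_x, rbf_kernel), train_y)
test_preds = predict(gram_matrix(test_x, train_x, rbf_kernel), alpha, train_y)
print(test_preds)
```

The classifier only ever touches Gram matrices, so swapping in a different kernel means replacing `rbf_kernel` alone and leaving the training and prediction code unchanged.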

Taxonomy

Topics: Neural Networks and Applications · Gaussian Processes and Bayesian Inference · Time Series Analysis and Forecasting