On the Provable Generalization of Recurrent Neural Networks

Lifu Wang; Bo Shen; Bo Hu; Xing Cao

arXiv:2109.14142 · cs.LG · January 27, 2022 · 1 citation

TL;DR

This paper provides theoretical guarantees for the generalization of over-parameterized RNNs, demonstrating learnability of complex functions without normalization constraints and analyzing the impact of input sequence structure.

Contribution

It introduces new generalization bounds for RNNs trained with random initialization, extending learnability results to non-additive and more complex function classes.

Findings

01 Learnability of certain functions without normalized input conditions

02 Almost-polynomial scaling of iterations and samples with input length

03 Extension to non-additive functions of input sequences

Abstract

The Recurrent Neural Network (RNN) is a fundamental structure in deep learning. Recent works study the training process of over-parameterized neural networks and show that such networks can learn functions in some notable concept classes with a provable generalization error bound. In this paper, we analyze training and generalization for RNNs with random initialization, and provide the following improvements over recent works: 1) For an RNN with input sequence $x = (X_{1}, X_{2}, \ldots, X_{L})$, previous works study learning functions that are a summation of $f(\beta_{l}^{T} X_{l})$ and require the normalization condition $\| X_{l} \| \leq \epsilon$ for some very small $\epsilon$ depending on the complexity of $f$. In this paper, using a detailed analysis of the neural tangent kernel matrix, we prove a generalization error bound for learning such functions without normalization conditions and…
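
To make the additive concept class named in the abstract concrete, the following is a minimal sketch of a target of the form $F(x) = \sum_{l} f(\beta_{l}^{T} X_{l})$ evaluated on a sequence whose entries are not rescaled to a small norm. The function names, the choice $f = \tanh$, and the Gaussian inputs are illustrative assumptions and do not come from the paper.

```python
import math
import random


def additive_target(x, betas, f=math.tanh):
    """Evaluate F(x) = sum_l f(beta_l^T X_l) for a sequence x = (X_1, ..., X_L).

    `f` defaults to tanh purely for illustration; the paper's concept
    class is stated for a general f.
    """
    total = 0.0
    for beta_l, X_l in zip(betas, x):
        inner = sum(b * v for b, v in zip(beta_l, X_l))  # beta_l^T X_l
        total += f(inner)
    return total


def make_sequence(L, d, rng):
    """Draw L vectors of dimension d with standard Gaussian entries.

    The vectors are deliberately NOT rescaled to a small norm, mirroring
    the 'without normalization conditions' setting (an assumption about
    how one might simulate it).
    """
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(L)]


rng = random.Random(0)
L, d = 5, 3  # hypothetical sequence length and input dimension
betas = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(L)]
x = make_sequence(L, d, rng)
y = additive_target(x, betas)
print(y)
```

Each term depends on only one position $X_{l}$ of the sequence, which is what makes this class additive; the findings above note that the paper also extends to non-additive functions.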

Videos

On the Provable Generalization of Recurrent Neural Networks · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Machine Learning and Algorithms