A Study of the Mathematics of Deep Learning

Anirbit Mukherjee

arXiv:2104.14033·cs.LG·April 30, 2021

TL;DR

This thesis advances the mathematical understanding of deep learning by establishing new theoretical results, algorithms, and bounds, thereby providing a rigorous foundation for neural network behavior and training methods.

Contribution

It introduces novel circuit complexity theorems, efficient training algorithms, convergence proofs for popular optimizers, and improved risk bounds for stochastic neural networks.

Findings

01. New circuit complexity theorems for neural functions

02. Linear-time training algorithm for ReLU gates

03. Convergence proofs for RMSProp and ADAM
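
For orientation on the third finding, the sketch below shows the RMSProp update as it is commonly stated. It is a minimal illustration, not the specific variant or step-size conditions analysed in the thesis. The function name `rmsprop_minimize`, the hyperparameter defaults, and the toy quadratic objective are all assumptions made for this example.

```python
import math


def rmsprop_minimize(grad, x0, lr=0.01, beta=0.9, eps=1e-8, steps=2000):
    """Run a textbook RMSProp loop on a list of floats.

    grad: function mapping a point (list of floats) to its gradient.
    All hyperparameter defaults here are illustrative assumptions.
    """
    x = list(x0)
    v = [0.0] * len(x)  # running average of squared gradients
    for _ in range(steps):
        g = grad(x)
        for i in range(len(x)):
            # Exponential moving average of the squared gradient.
            v[i] = beta * v[i] + (1.0 - beta) * g[i] ** 2
            # Scale the step per coordinate by the root of that average.
            x[i] -= lr * g[i] / (math.sqrt(v[i]) + eps)
    return x


# Toy objective f(x) = sum(x_i^2), whose gradient is 2x (hypothetical example).
x_final = rmsprop_minimize(lambda x: [2.0 * xi for xi in x], [1.0, -2.0])
```

With a constant step size the iterates settle into a small neighbourhood of the minimiser rather than converging exactly, which is one reason convergence statements for such methods depend on the choice of step sizes and other parameters.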

Abstract

"Deep Learning"/"Deep Neural Nets" is a technological marvel that is now increasingly deployed at the cutting edge of artificial intelligence tasks. The dramatic success of deep learning in the last few years has hinged on an enormous number of heuristics, and rigorously explaining them has turned out to be a serious mathematical challenge. In this thesis, submitted to the Department of Applied Mathematics and Statistics, Johns Hopkins University, we take several steps towards building strong theoretical foundations for these new paradigms of deep learning. In chapter 2 we show new circuit complexity theorems for deep neural functions and prove classification theorems about these function spaces, which in turn lead to exact algorithms for empirical risk minimization for depth-2 ReLU nets. We also motivate a measure of complexity of neural functions to constructively…

Code & Models

Repositories

impredicative/irc-url-title-bot
tf

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications

Methods: RMSProp