Geometric Understanding of Deep Learning
Na Lei, Zhongxuan Luo, Shing-Tung Yau, David Xianfeng Gu

TL;DR
This paper offers a geometric perspective on deep learning, emphasizing the importance of data manifolds and introducing concepts like rectified linear complexity to understand neural network capabilities.
Contribution
It introduces a geometric framework for deep learning, built on the manifold structure of data and on complexity measures such as rectified linear complexity, and proposes using optimal mass transportation to control the probability distribution in the latent space.
Findings
Deep learning learns the data manifold and the probability distribution on it.
For any deep neural network with a fixed architecture, there exists a manifold that the network cannot learn.
Optimal mass transportation can control latent space distributions.
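The last finding can be illustrated with a one-dimensional toy case. The sketch below is not the paper's algorithm; it only assumes equal-size samples on the real line and a squared-distance cost, where matching the i-th smallest source point to the i-th smallest target point is known to be an optimal transport plan. The names `ot_map_1d`, `transport_cost`, `latent`, and `prior` are hypothetical and chosen for illustration.

```python
import random


def ot_map_1d(source, target):
    """Map each source sample to a target sample by monotone (sorted) matching.

    Assumption: equal-size samples on the real line with a convex cost such
    as squared distance, for which this matching is an optimal transport plan.
    """
    if len(source) != len(target):
        raise ValueError("this sketch assumes equal sample counts")
    # Indices of the source samples, from smallest to largest value.
    order = sorted(range(len(source)), key=lambda i: source[i])
    sorted_target = sorted(target)
    mapped = [0.0] * len(source)
    for rank, i in enumerate(order):
        # The rank-th smallest source point receives the rank-th smallest target.
        mapped[i] = sorted_target[rank]
    return mapped


def transport_cost(source, mapped):
    """Mean squared displacement of a pairing (source[i] -> mapped[i])."""
    return sum((s - m) ** 2 for s, m in zip(source, mapped)) / len(source)


rng = random.Random(0)
# Hypothetical 1D latent codes and samples from a desired latent distribution.
latent = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
prior = [rng.gauss(0.0, 1.0) for _ in range(1000)]

mapped = ot_map_1d(latent, prior)
```

After the call, `mapped` has exactly the empirical distribution of `prior`, and its pairing with `latent` costs no more than the arbitrary pairing given by the original sample order. Higher-dimensional latent spaces need a genuine optimal transport solver, which this sketch does not attempt.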
Abstract
Deep learning is the mainstream technique for many machine learning tasks, including image recognition, machine translation, and speech recognition. It has outperformed conventional methods in various fields and achieved great success. Unfortunately, how it works remains poorly understood, and laying down a theoretical foundation for deep learning is of central importance. In this work, we give a geometric view of deep learning: we show that the fundamental principle behind its success is the manifold structure in data. Natural high-dimensional data concentrates close to a low-dimensional manifold, and deep learning learns the manifold and the probability distribution on it. We further introduce the concept of rectified linear complexity of a deep neural network, which measures its learning capability, and the rectified linear complexity of an embedding…
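The abstract is cut off before it defines rectified linear complexity, so the following sketch does not reproduce that definition. It only illustrates the fact the concept rests on: a ReLU network computes a piecewise linear function. Assuming a small fully connected network with a scalar input, it counts how often the hidden activation pattern changes along a grid, which gives a grid-based estimate of the number of linear regions on an interval. All names (`relu_net_pattern`, `random_layers`, `count_linear_pieces`) are hypothetical.

```python
import random


def relu_net_pattern(x, layers):
    """Forward a scalar input through a ReLU MLP.

    Returns (output, pattern), where pattern records which hidden units
    are active. The last layer is linear (no ReLU).
    """
    h = [x]
    pattern = []
    for idx, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        if idx < len(layers) - 1:
            pattern.extend(zi > 0 for zi in z)
            h = [max(zi, 0.0) for zi in z]
        else:
            h = z
    return h[0], tuple(pattern)


def random_layers(sizes, rng):
    """Gaussian weights and biases for layer widths such as [1, 5, 1]."""
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.gauss(0.0, 1.0) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [rng.gauss(0.0, 1.0) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers


def count_linear_pieces(layers, lo=-3.0, hi=3.0, steps=20000):
    """Count activation-pattern changes on a grid over [lo, hi].

    This is an estimate: regions narrower than the grid spacing are missed,
    and two adjacent patterns may happen to share the same affine function.
    """
    pieces = 1
    _, prev = relu_net_pattern(lo, layers)
    for k in range(1, steps + 1):
        x = lo + (hi - lo) * k / steps
        _, cur = relu_net_pattern(x, layers)
        if cur != prev:
            pieces += 1
            prev = cur
    return pieces


# Hand-built example: f(x) = relu(x) + relu(x - 1), with kinks at 0 and 1.
two_kinks = [([[1.0], [1.0]], [0.0, -1.0]), ([[1.0, 1.0]], [0.0])]
pieces_two_kinks = count_linear_pieces(two_kinks)

# Random network with one hidden layer of width 5 and a scalar input.
net = random_layers([1, 5, 1], random.Random(0))
pieces_random = count_linear_pieces(net)
```

With a scalar input and a single hidden layer of width n, each hidden unit switches at most once along the line, so the count cannot exceed n + 1. Deeper networks can produce more pieces for the same number of units.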
Taxonomy
Topics: Topological and Geometric Data Analysis · Gaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis
