A theoretical framework for deep locally connected ReLU network
Yuandong Tian

TL;DR
This paper introduces a new theoretical framework for deep locally connected ReLU networks that models data distribution and disentangled representations without unrealistic assumptions, aiding analysis of generalization and overfitting.
Contribution
It presents a novel, realistic theoretical framework based on teacher-student models for analyzing deep locally connected ReLU networks.
Findings
Framework models data distribution explicitly
Supports disentangled representations and regularization
Facilitates analysis of overfitting and generalization
Abstract
Understanding the theoretical properties of deep, locally connected nonlinear networks, such as deep convolutional neural networks (DCNNs), is still a hard problem despite their empirical success. In this paper, we propose a novel theoretical framework for such networks with ReLU nonlinearity. The framework explicitly formulates the data distribution, favors disentangled representations, and is compatible with common regularization techniques such as Batch Norm. The framework is built upon a teacher-student setting, by expanding the student's forward/backward propagation onto the teacher's computational graph. The resulting model does not impose unrealistic assumptions (e.g., Gaussian inputs, independence of activations). Our framework could help facilitate theoretical analysis of many practical issues in deep networks, e.g., overfitting, generalization, and disentangled representations.
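For readers unfamiliar with the teacher-student setting mentioned in the abstract, the following is a minimal generic sketch, not the paper's framework. It assumes a small fully connected two-layer ReLU network (not a locally connected one), uniformly sampled inputs, and plain gradient descent. All names (`forward`, `train_student`) and hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(x, w1, w2):
    """Two-layer ReLU network: returns hidden activations and output."""
    h = relu(x @ w1)
    return h, h @ w2

def train_student(n_samples=256, d_in=8, d_hidden=16, steps=500, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)

    # Teacher: a fixed random network that defines the target function.
    tw1 = rng.normal(size=(d_in, d_hidden)) / np.sqrt(d_in)
    tw2 = rng.normal(size=(d_hidden, 1)) / np.sqrt(d_hidden)

    # Inputs are drawn uniformly here purely for illustration (an assumption).
    x = rng.uniform(-1.0, 1.0, size=(n_samples, d_in))
    _, y = forward(x, tw1, tw2)  # teacher outputs serve as labels

    # Student: same architecture, independently initialized.
    sw1 = rng.normal(size=(d_in, d_hidden)) / np.sqrt(d_in)
    sw2 = rng.normal(size=(d_hidden, 1)) / np.sqrt(d_hidden)

    losses = []
    for _ in range(steps):
        h, pred = forward(x, sw1, sw2)
        err = pred - y
        losses.append(float(np.mean(err ** 2)))

        # Backward pass for the mean squared error.
        grad_pred = 2.0 * err / n_samples
        grad_w2 = h.T @ grad_pred
        grad_h = grad_pred @ sw2.T
        grad_pre = grad_h * (h > 0)  # ReLU gate
        grad_w1 = x.T @ grad_pre

        sw1 -= lr * grad_w1
        sw2 -= lr * grad_w2
    return losses

if __name__ == "__main__":
    losses = train_student()
    print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}")
```

The student only ever sees the teacher's outputs, so any analysis of the student's training dynamics can be phrased in terms of the teacher's structure, which is the general idea the abstract builds on.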
Taxonomy
Topics: Neural Networks and Applications · Face and Expression Recognition · Machine Learning and ELM
