Multi-layer Perceptron Trainability Explained via Variability

Yueyao Yu; Yin Zhang

arXiv:2105.08911 · cs.LG · May 19, 2023 · 1 citation

Open Access

TL;DR

This paper introduces variability, a measure intended to explain MLP trainability, and shows empirically that it correlates positively with the number of activations and negatively with the "Collapse to Constant" phenomenon.

Contribution

The study proposes a new measure called variability to better understand factors affecting deep neural network trainability, especially in MLPs.

Findings

01 Variability correlates positively with the number of activations.

02 Variability correlates negatively with the "Collapse to Constant" phenomenon.

03 The absolute value activation yields higher variability than ReLU.

Abstract

Despite the tremendous successes of deep neural networks (DNNs) in various applications, many fundamental aspects of deep learning remain incompletely understood, including DNN trainability. In a trainability study, one aims to discern what makes one DNN model easier to train than another under comparable conditions. In particular, our study focuses on multi-layer perceptron (MLP) models equipped with the same number of parameters. We introduce a new notion called variability to help explain the benefits of deep learning and the difficulties in training very deep MLPs. Simply put, variability of a neural network represents the richness of landscape patterns in the data space with respect to well-scaled random weights. We empirically show that variability is positively correlated to the number of activations and negatively correlated to a phenomenon called "Collapse to Constant", which…
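
To make the "Collapse to Constant" idea more concrete, here is a minimal sketch in plain Python. It is not the paper's definition of variability, which this page does not reproduce. As an assumed stand-in, it draws random ReLU MLPs with He-scaled Gaussian weights and no biases, evaluates them on points of the unit circle, and reports one minus the mean pairwise cosine similarity of the last-layer features. The names `make_weights`, `features`, and `feature_spread`, as well as the width, input set, and scaling, are all illustrative choices.

```python
import math
import random


def relu(v):
    return v if v > 0.0 else 0.0


def make_weights(depth, width, in_dim, rng):
    """Gaussian weights with variance 2 / fan_in (He scaling), no biases.

    The scaling is an assumption standing in for "well-scaled random weights".
    """
    weights, fan_in = [], in_dim
    for _ in range(depth):
        std = math.sqrt(2.0 / fan_in)
        weights.append([[rng.gauss(0.0, std) for _ in range(fan_in)]
                        for _ in range(width)])
        fan_in = width
    return weights


def features(weights, x):
    """Return the last hidden layer's activations for input x."""
    h = list(x)
    for layer in weights:
        h = [relu(sum(w * v for w, v in zip(row, h))) for row in layer]
    return h


def cosine(a, b):
    na = math.sqrt(sum(v * v for v in a))
    nb = math.sqrt(sum(v * v for v in b))
    if na == 0.0 or nb == 0.0:
        return 1.0  # an all-zero feature vector is treated as fully collapsed
    return sum(p * q for p, q in zip(a, b)) / (na * nb)


def feature_spread(depth, width=32, n_points=16, seed=0):
    """Illustrative proxy: 1 - mean pairwise cosine similarity of features.

    Values near 0 mean every input yields almost the same feature direction.
    """
    rng = random.Random(seed)
    weights = make_weights(depth, width, 2, rng)
    inputs = [(math.cos(2.0 * math.pi * k / n_points),
               math.sin(2.0 * math.pi * k / n_points))
              for k in range(n_points)]
    feats = [features(weights, x) for x in inputs]
    sims = [cosine(feats[i], feats[j])
            for i in range(n_points) for j in range(i + 1, n_points)]
    return 1.0 - sum(sims) / len(sims)


if __name__ == "__main__":
    for depth in (1, 5, 20, 50):
        print(depth, round(feature_spread(depth), 4))
```

Under these assumptions the proxy should shrink as depth grows, since the features of different inputs become nearly parallel. That is one simple way a very deep, randomly initialized MLP can come to look like a constant function of its input.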

Taxonomy

Topics: Neural Networks and Applications