Linear discriminant initialization for feed-forward neural networks

Marissa Masden; Dev Sinha

arXiv:2007.12782 · cs.LG · August 19, 2020 · 5 cites

Open Access

TL;DR

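This paper initializes the first-layer weights of a feed-forward neural network with the linear discriminants that best separate individual classes, a choice motivated by the geometry of such networks. Networks initialized this way need fewer training steps to reach a given level of training and attain higher asymptotic accuracy on training data.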

Contribution

The paper proposes initializing the first layer of a neural network with class-separating linear discriminants, which reduces the number of training steps needed and raises asymptotic training accuracy.

Findings

1. Fewer training steps needed for convergence.

2. Higher asymptotic training accuracy.

3. Effective across different network architectures.

Abstract

Informed by the basic geometry underlying feed-forward neural networks, we initialize the weights of the first layer of a neural network using the linear discriminants which best distinguish individual classes. Networks initialized in this way take fewer training steps to reach the same level of training, and asymptotically have higher accuracy on training data.
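
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure, which the abstract does not spell out. It assumes one hidden unit per class, a one-vs-rest Fisher discriminant as the "linear discriminant", and a bias placed midway between the projected class means. The function name `lda_first_layer_init` and the regularizer `reg` are hypothetical, introduced only for illustration.

```python
import numpy as np

def lda_first_layer_init(X, y, reg=1e-3):
    """Illustrative sketch: one-vs-rest Fisher discriminants as first-layer weights.

    Assumptions (not taken from the paper): one hidden unit per class,
    unit-norm weight rows, bias at the midpoint of the projected means.
    """
    classes = np.unique(y)
    n_features = X.shape[1]
    W = np.zeros((len(classes), n_features))
    b = np.zeros(len(classes))
    for i, c in enumerate(classes):
        pos, neg = X[y == c], X[y != c]
        mu_pos, mu_neg = pos.mean(axis=0), neg.mean(axis=0)
        # Within-group scatter of "class c" and "everything else".
        dp, dn = pos - mu_pos, neg - mu_neg
        Sw = (dp.T @ dp + dn.T @ dn) / len(X)
        # reg keeps the scatter matrix invertible (hypothetical choice).
        Sw += reg * np.eye(n_features)
        # Fisher direction: Sw^{-1} (mu_pos - mu_neg), scaled to unit length.
        w = np.linalg.solve(Sw, mu_pos - mu_neg)
        w /= np.linalg.norm(w)
        W[i] = w
        # Hyperplane passes through the midpoint of the two means.
        b[i] = -0.5 * w @ (mu_pos + mu_neg)
    return W, b

# Toy data: three Gaussian blobs in the plane, 50 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
X = np.vstack([rng.normal(c, 1.0, size=(50, 2)) for c in centers])
y = np.repeat(np.arange(3), 50)

W, b = lda_first_layer_init(X, y)
# What the first layer would compute before its nonlinearity.
pre_activations = X @ W.T + b
```

In this sketch, `W` and `b` would be copied into the first layer of a network before training, with the remaining layers initialized in whatever way the framework normally uses. Each hidden unit then starts out positive on average for its own class and negative on average for the others.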

Taxonomy

Topics: Neural Networks and Applications · Face and Expression Recognition · Anomaly Detection Techniques and Applications