Comparison of non-linear activation functions for deep neural networks on MNIST classification task

Dabal Pedamonti

arXiv:1804.02763 · cs.LG · April 10, 2018 · 120 cites

TL;DR

This paper compares various non-linear activation functions in deep neural networks on the MNIST classification task, analyzing their advantages, disadvantages, and the impact of weight initialization and network depth.

Contribution

It provides a comprehensive evaluation of different activation functions and the effects of weight initialization in deep neural networks for MNIST classification.

Findings

01. Activation functions significantly affect network performance.

02. Deeper networks improve classification accuracy.

03. Weight initialization impacts training stability.

Abstract

Activation functions play a key role in neural networks, so it is fundamental to understand their advantages and disadvantages in order to achieve better performance. This paper first introduces common types of non-linear activation functions that are alternatives to the well-known sigmoid function and then evaluates their characteristics. Deeper neural networks are also analysed, because they positively influence the final performance compared to shallower networks. Deep networks also depend strictly on the weight initialisation, so the effect of drawing weights from Gaussian and uniform distributions is analysed, paying particular attention to how the number of incoming and outgoing connections to a node influences the whole network.
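
To make the two ideas in the abstract concrete, here is a minimal sketch in plain Python. It rests on assumptions that the abstract does not confirm. The abstract names only the sigmoid, so ReLU, leaky ReLU and ELU are used here as commonly cited alternatives. The scaling by incoming and outgoing connections (fan-in and fan-out) follows the widely used Glorot-style rule, which may differ from the paper's exact scheme. The layer sizes (784 inputs for 28x28 MNIST images, 100 hidden units) and all function names are illustrative.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relu(x: float) -> float:
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def leaky_relu(x: float, alpha: float = 0.01) -> float:
    """Like ReLU, but keeps a small slope alpha for negative inputs."""
    return x if x > 0 else alpha * x


def elu(x: float, alpha: float = 1.0) -> float:
    """Exponential linear unit: saturates smoothly towards -alpha."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def init_uniform(fan_in: int, fan_out: int, rng: random.Random) -> list:
    """Assumed Glorot-style uniform init: U(-limit, limit)."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def init_gaussian(fan_in: int, fan_out: int, rng: random.Random) -> list:
    """Assumed Glorot-style Gaussian init: N(0, 2 / (fan_in + fan_out))."""
    std = math.sqrt(2.0 / (fan_in + fan_out))
    return [[rng.gauss(0.0, std) for _ in range(fan_out)]
            for _ in range(fan_in)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical first layer: 784 MNIST pixels -> 100 hidden units.
    w_uniform = init_uniform(784, 100, rng)
    w_gauss = init_gaussian(784, 100, rng)
    for name, fn in [("sigmoid", sigmoid), ("relu", relu),
                     ("leaky_relu", leaky_relu), ("elu", elu)]:
        print(name, [round(fn(x), 4) for x in (-2.0, 0.0, 2.0)])
```

Both initialisers shrink the spread of the weights as a node gains more incoming or outgoing connections, which is the dependence the abstract highlights. Under these assumptions the uniform and Gaussian variants have the same variance, 2 / (fan_in + fan_out), so they differ only in the shape of the distribution.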

Taxonomy

Topics: Neural Networks and Applications · Blind Source Separation Techniques · Face and Expression Recognition