Training Feedforward Neural Networks with Standard Logistic Activations is Feasible

Emanuele Sansone; Francesco G.B. De Natale

arXiv:1710.01013·cs.NE·October 4, 2017·2 cites

Open Access

TL;DR

This paper shows that feedforward neural networks with standard logistic activations can be trained to reach generalization performance comparable to that of networks with hyperbolic tangent activations, provided the parameters are initialized under a set of conditions derived from an information-theoretic study of a single neuron.

Contribution

The work introduces a set of parameter initialization conditions, derived from an information-theoretic analysis of a single neuron, that enable successful training of networks with logistic activations.

Findings

01

Networks trained with the proposed initialization achieve generalization performance comparable to that of hyperbolic tangent networks.

02

The initialization conditions are validated through extensive experiments.

03

Training logistic activation networks is feasible with proper parameter initialization.

Abstract

Training feedforward neural networks with standard logistic activations is considered difficult because of the intrinsic properties of these sigmoidal functions. This work aims to show that these networks can be trained to achieve generalization performance comparable to that of networks based on hyperbolic tangent activations. The solution consists of applying a set of conditions in parameter initialization, which have been derived from the study of the properties of a single neuron from an information-theoretic perspective. The proposed initialization is validated through an extensive experimental analysis.
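
The abstract does not say which intrinsic properties it has in mind. As a hedged illustration only, the sketch below checks two commonly cited ones: the logistic function is not zero-centered (its output at 0 is 0.5), and its slope never exceeds 0.25, whereas tanh is zero-centered with a maximum slope of 1. The function names and the `slope_product_bound` helper are assumptions made for this example; this is not the initialization proposed in the paper.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def logistic_slope(x: float) -> float:
    """Derivative of the logistic function: s(x) * (1 - s(x))."""
    s = logistic(x)
    return s * (1.0 - s)


def tanh_slope(x: float) -> float:
    """Derivative of tanh: 1 - tanh(x)^2."""
    t = math.tanh(x)
    return 1.0 - t * t


def slope_product_bound(max_slope: float, depth: int) -> float:
    """Hypothetical helper: upper bound on the product of activation
    slopes along a path through `depth` layers. Weights are ignored,
    which is a simplification made for this illustration."""
    return max_slope ** depth


if __name__ == "__main__":
    # Output at the origin: logistic is offset, tanh is centered.
    print("logistic(0) =", logistic(0.0), " tanh(0) =", math.tanh(0.0))
    # Both slopes peak at the origin.
    print("max logistic slope =", logistic_slope(0.0))
    print("max tanh slope     =", tanh_slope(0.0))
    # The two functions are related by logistic(x) = (1 + tanh(x / 2)) / 2.
    x = 1.3
    print(logistic(x), 0.5 * (1.0 + math.tanh(x / 2.0)))
    # Slope-only bound for a 10-layer path.
    print(slope_product_bound(0.25, 10), slope_product_bound(1.0, 10))
```

Because the logistic function is an affine rescaling of tanh, the difference between the two comes down to scale and offset, which is consistent with the abstract's claim that suitable initialization conditions can close the gap.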

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Machine Learning and ELM