Robust Weight Initialization for Tanh Neural Networks with Fixed Point Analysis

Hyunwoo Lee; Hayoung Choi; Hyunju Kim

arXiv:2410.02242·cs.LG·March 4, 2025

TL;DR

This paper introduces a weight initialization technique for tanh neural networks based on fixed point analysis, improving training robustness, convergence speed, and data efficiency over Xavier initialization.

Contribution

The paper proposes a novel initialization method derived from a fixed point analysis of $\tanh(ax)$, addressing activation saturation and outperforming Xavier initialization.

Findings

01. Outperforms Xavier initialization in robustness and convergence speed

02. Effective across various network sizes and datasets

03. Enhances data efficiency in training deep tanh networks

Abstract

Increasing a neural network's depth can improve its generalization performance. However, training deep networks is challenging due to gradient and signal propagation issues. To address these challenges, extensive theoretical research has been conducted and various methods have been introduced. Despite these advances, effective weight initialization methods for tanh neural networks remain insufficiently investigated. This paper presents a novel weight initialization method for neural networks with the tanh activation function. Based on an analysis of the fixed points of the function $\tanh(ax)$, the proposed method aims to determine values of $a$ that mitigate activation saturation. A series of experiments on various classification datasets and physics-informed neural networks demonstrates that the proposed method outperforms Xavier initialization methods (with or without normalization) in terms of…
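To make the fixed point idea concrete, the following is a minimal sketch that iterates $x \leftarrow \tanh(ax)$ for a few values of $a$. It is not the paper's initialization procedure, which this summary does not spell out; the function name `tanh_fixed_point`, the starting point, and the tolerance are illustrative assumptions. It only demonstrates a standard fact: for $a \le 1$ the only fixed point is $0$, while for $a > 1$ a nonzero pair $\pm x^*$ appears.

```python
import math


def tanh_fixed_point(a, x0=1.0, tol=1e-12, max_iter=10_000):
    """Iterate x <- tanh(a * x) starting from x0 (hypothetical helper).

    Returns the last iterate once successive values differ by less
    than tol, or after max_iter steps if that never happens.
    """
    x = x0
    for _ in range(max_iter):
        x_next = math.tanh(a * x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x


# Compare a value of a below 1 with two values above 1.
for a in (0.5, 1.5, 2.0):
    x_star = tanh_fixed_point(a)
    residual = abs(math.tanh(a * x_star) - x_star)
    print(f"a={a}: x*={x_star:.6f}, residual={residual:.2e}")
```

With $a = 0.5$ the iterates collapse toward zero, whereas with $a = 1.5$ or $a = 2.0$ they settle at a positive value below 1. How the paper turns this behavior into a choice of $a$ for weight initialization is described in the full text, not in this abstract.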

Taxonomy

Topics: Neural Networks and Applications

Methods: Tanh Activation · Xavier Initialization