Understanding Two-Layer Neural Networks with Smooth Activation Functions

Changcun Huang

arXiv:2507.14177·cs.LG·July 22, 2025

TL;DR

This paper analyzes the training solutions of two-layer neural networks with smooth activation functions, revealing their approximation capabilities and underlying mechanisms through theoretical proofs and experiments.

Contribution

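The paper introduces a framework for understanding two-layer networks with smooth activations, built on four principles (construction of Taylor series expansions, strict partial order of knots, smooth-spline implementation, and smooth-continuity restriction), together with new proofs of universal approximation and insights into the solution space.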

Findings

01 Universal approximation property proved for arbitrary input dimensions.

02 Experimental verification supports theoretical insights.

03 Mechanisms involving Taylor expansions and smoothness principles are elucidated.

Abstract

This paper aims to understand the training solution, obtained by the back-propagation algorithm, of two-layer neural networks whose hidden layer is composed of units with smooth activation functions, including the usual sigmoid type most commonly used before the advent of ReLUs. The mechanism rests on four main principles: construction of Taylor series expansions, strict partial order of knots, smooth-spline implementation, and smooth-continuity restriction. Universal approximation for arbitrary input dimensionality is proved and experimental verification is given, through which the mystery of the "black box" of the solution space is largely revealed. The new proofs employed also enrich approximation theory.
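To make the object of study concrete, the following is a minimal sketch of a two-layer network with sigmoid hidden units trained by back-propagation. It is not the paper's experimental setup: the scalar input, the mean-squared-error loss, full-batch gradient descent, and the names `init_params`, `forward`, `mse`, and `train` are all assumptions chosen for illustration. The network computes f(x) = sum_i a[i] * sigmoid(w[i] * x + b[i]) + c.

```python
import math
import random


def sigmoid(z):
    """Logistic sigmoid, a smooth activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def init_params(hidden, seed=0):
    """Random initial weights (hypothetical initialization, not from the paper)."""
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1) for _ in range(hidden)]  # input-to-hidden weights
    b = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden biases
    a = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden-to-output weights
    return w, b, a, 0.0                              # last entry: output bias c


def forward(params, x):
    """Return the network output and the hidden activations for scalar x."""
    w, b, a, c = params
    h = [sigmoid(wi * x + bi) for wi, bi in zip(w, b)]
    return sum(ai * hi for ai, hi in zip(a, h)) + c, h


def mse(params, data):
    """Mean squared error over a list of (x, y) pairs."""
    return sum((forward(params, x)[0] - y) ** 2 for x, y in data) / len(data)


def train(params, data, lr=0.05, epochs=2000):
    """Full-batch gradient descent; gradients follow from the chain rule."""
    w, b, a, c = list(params[0]), list(params[1]), list(params[2]), params[3]
    n, hidden = len(data), len(w)
    for _ in range(epochs):
        gw, gb, ga, gc = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden, 0.0
        for x, y in data:
            out, h = forward((w, b, a, c), x)
            err = 2.0 * (out - y) / n                # d(loss)/d(output)
            gc += err
            for i in range(hidden):
                ga[i] += err * h[i]
                dz = err * a[i] * h[i] * (1.0 - h[i])  # sigmoid'(z) = h * (1 - h)
                gw[i] += dz * x
                gb[i] += dz
        for i in range(hidden):
            w[i] -= lr * gw[i]
            b[i] -= lr * gb[i]
            a[i] -= lr * ga[i]
        c -= lr * gc
    return w, b, a, c
```

As a usage example, one could sample a smooth target such as `math.sin` on a grid over [-3, 3], call `train(init_params(8), data)`, and compare `mse` before and after training. The trained parameters `(w, b, a, c)` are the kind of "training solution" whose structure the paper sets out to explain.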
