Approximation with SiLU Networks: Constant Depth and Exponential Rates for Basic Operations
Koffi O. Ayena

TL;DR
This paper demonstrates how properly tuned SiLU neural networks can efficiently approximate functions like $x^2$ with constant depth and exponential rates, emphasizing the importance of hyperparameter optimization.
Contribution
It introduces SiLU network constructions that achieve approximation with constant depth and exponential rates by optimal hyperparameter tuning, extending to Sobolev spaces.
Findings
Two-layer networks approximate $x^2$ with error $\varepsilon$ using weights scaling as $\beta^{\pm k}$.
Networks with depth $O(1)$ and $O(\varepsilon^{-d/n})$ parameters approximate Sobolev functions.
Hyperparameter tuning critically influences the approximation efficiency of SiLU networks.
Abstract
We present SiLU network constructions whose approximation efficiency depends critically on proper hyperparameter tuning. For the square function $x^2$, with optimally chosen shift and scale, we achieve approximation error $\varepsilon$ using a two-layer network of constant width, where weights scale as $\beta^{\pm k}$. Extending this approach through functional composition to Sobolev spaces, we obtain networks with depth $O(1)$ and $O(\varepsilon^{-d/n})$ parameters under optimal hyperparameter settings. Our work highlights the trade-off between architectural depth and activation parameter optimization in neural network approximation theory.
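To make the $\beta^{\pm k}$ weight scaling concrete, here is a minimal sketch of one way a two-unit SiLU layer can approximate $x^2$ on $[-1, 1]$. It is an illustration under stated assumptions, not necessarily the paper's construction: it relies on the identity $\mathrm{SiLU}(z) + \mathrm{SiLU}(-z) = z \tanh(z/2) \approx z^2/2$ for small $z$, uses no shift, and the names `square_approx` and `max_error` as well as the default values of `beta` and `k` are hypothetical.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def square_approx(x: float, beta: float = 2.0, k: int = 5) -> float:
    """Approximate x**2 with one hidden layer of two SiLU units.

    Assumption (not taken from the paper): inner weights are +/- beta**(-k)
    and the outer weight is 2 * beta**(2k), with no shift.
    """
    s = beta ** k
    z = x / s  # small inner weight beta^{-k} pushes the input toward 0
    # SiLU(z) + SiLU(-z) = z * tanh(z / 2) ~ z^2 / 2 near zero,
    # so the large outer weight 2 * s^2 rescales the sum back to ~ x^2.
    return 2.0 * s * s * (silu(z) + silu(-z))


def max_error(beta: float, k: int, n: int = 201) -> float:
    """Largest absolute error of square_approx on a uniform grid over [-1, 1]."""
    xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return max(abs(square_approx(x, beta, k) - x * x) for x in xs)


if __name__ == "__main__":
    for k in (2, 4, 6, 8):
        print(k, max_error(2.0, k))
```

In this sketch the leading error term is $x^4 / (12\,\beta^{2k})$, so the error on $[-1, 1]$ shrinks geometrically as $k$ grows while the width and depth stay fixed. The cost is that the outer weight grows like $\beta^{2k}$, which reflects the trade-off between architecture size and weight magnitude that the abstract points to.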
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks · Neural Networks and Applications
