Variations on the Chebyshev-Lagrange Activation Function
Yuchen Li, Frank Rudzicz, Jekaterina Novikova

TL;DR
This paper introduces a novel class of parameterized polynomial activation functions based on Chebyshev-Lagrange interpolation, improving neural network data efficiency and performance on various tasks.
Contribution
It presents new implementations of piece-wise polynomial activations parameterized at Chebyshev nodes and demonstrates their effectiveness on synthetic datasets and on classification tasks.
Findings
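Significant improvements in capacity of expression and accuracy of interpolation on synthetic datasets when inputs outside [-1, 1] are linearly extrapolated.
Competitive or state-of-the-art classification performance on MNIST, CIFAR-10, and DementiaBank when replacing ReLU or tanh in deep residual architectures.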
Enhanced data efficiency in neural networks.
Abstract
We seek to improve the data efficiency of neural networks and present novel implementations of parameterized piece-wise polynomial activation functions. The parameters are the y-coordinates of n+1 Chebyshev nodes per hidden unit, and Lagrangian interpolation between the nodes produces the polynomial on [-1, 1]. We show results for different methods of handling inputs outside [-1, 1] on synthetic datasets, finding significant improvements in capacity of expression and accuracy of interpolation in models that compute some form of linear extrapolation from either end. We demonstrate competitive or state-of-the-art performance on the classification of images (MNIST and CIFAR-10) and minimally-correlated vectors (DementiaBank) when we replace ReLU or tanh with linearly extrapolated Chebyshev-Lagrange activations in deep residual architectures.
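The construction described in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the node type (Chebyshev nodes of the first kind), the tangent-line form of the linear extrapolation, and the function names `chebyshev_nodes`, `lagrange_poly`, and `cl_activation` are all hypothetical choices made here. The abstract only says that "some form" of linear extrapolation is used.

```python
import numpy as np

def chebyshev_nodes(n):
    """Return n+1 Chebyshev nodes on (-1, 1).

    Assumption: nodes of the first kind; the abstract does not say which kind.
    """
    k = np.arange(n + 1)
    return np.cos((2 * k + 1) * np.pi / (2 * (n + 1)))

def lagrange_poly(nodes, y):
    """Build the degree-n polynomial through the points (nodes[j], y[j])."""
    p = np.poly1d([0.0])
    for j, xj in enumerate(nodes):
        others = np.delete(nodes, j)
        # j-th Lagrange basis polynomial: 1 at nodes[j], 0 at every other node.
        basis = np.poly1d(others, r=True) / float(np.prod(xj - others))
        p = p + basis * float(y[j])
    return p

def cl_activation(x, y):
    """Polynomial on [-1, 1]; tangent-line extension outside (assumed variant).

    y holds the trainable y-coordinates at the n+1 nodes for one hidden unit.
    """
    x = np.asarray(x, dtype=float)
    p = lagrange_poly(chebyshev_nodes(len(y) - 1), y)
    dp = p.deriv()
    out = p(np.clip(x, -1.0, 1.0))
    # Outside the interval, continue with the value and slope at the boundary.
    out = np.where(x > 1.0, p(1.0) + dp(1.0) * (x - 1.0), out)
    out = np.where(x < -1.0, p(-1.0) + dp(-1.0) * (x + 1.0), out)
    return out

# Usage: initialise the node values so the activation starts as x**2 on [-1, 1].
nodes = chebyshev_nodes(3)
values = cl_activation([-3.0, 0.5, 2.0], nodes ** 2)
```

In a trainable layer the array `y` would be a learned parameter per hidden unit. Because the output is linear in `y`, gradients with respect to the node values are simply the Lagrange basis polynomials evaluated at the (clipped or extrapolated) input.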
Taxonomy
Topics: Statistical Mechanics and Entropy · Statistical and numerical algorithms · Gaussian Processes and Bayesian Inference
