Optimal Nonlinearities Improve Generalization Performance of Random Features

Samet Demir; Zafer Doğan

arXiv:2309.16846·cs.LG·October 2, 2023

TL;DR

This paper introduces optimal nonlinear activation functions for random feature models, demonstrating improved generalization and mitigation of double descent phenomena across various tasks including CIFAR10.

Contribution

The paper identifies a set of optimal nonlinearities derived from the parameters of the equivalent Gaussian model, improving generalization over standard activation functions such as ReLU.

Findings

01. Optimized nonlinearities generalize better than ReLU.

02. The proposed functions mitigate the double descent phenomenon.

03. Experimental validation on synthetic and real data supports the theoretical claims.

Abstract

A random feature model with a nonlinear activation function has been shown to be asymptotically equivalent to a Gaussian model in terms of training and generalization errors. Analysis of the equivalent model reveals an important yet not fully understood role played by the activation function. To address this issue, we study the "parameters" of the equivalent model to achieve improved generalization performance for a given supervised learning problem. We show that the parameters acquired from the Gaussian model enable us to define a set of optimal nonlinearities. We provide two example classes from this set: second-order polynomial and piecewise linear functions. These functions are optimized to improve generalization performance regardless of the actual form. We experiment with regression and classification problems, including synthetic and real (e.g., CIFAR10) data. Our numerical…
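To make the "parameters" of the equivalent Gaussian model more concrete, the sketch below computes the quantities that the random-features literature commonly uses for this purpose: the first two Hermite coefficients of the activation under a standard Gaussian input, plus the norm of the remainder. This is an assumption about the parameterization, not a reproduction of the paper's procedure; the symbols `mu0`, `mu1`, `mu_star` and the function names are illustrative and do not come from the abstract.

```python
import numpy as np


def gaussian_equivalent_coefficients(activation, half_width=10.0, num_points=200_001):
    """Approximate mu0 = E[s(z)], mu1 = E[z s(z)], and
    mu_star = sqrt(E[s(z)^2] - mu0^2 - mu1^2) for z ~ N(0, 1).

    Assumption: these three numbers stand in for the "parameters" of the
    equivalent Gaussian model; the paper's exact notation may differ.
    """
    # Dense symmetric grid; the Gaussian density is negligible beyond |z| = 10.
    z = np.linspace(-half_width, half_width, num_points)
    dz = z[1] - z[0]
    pdf = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    s = activation(z)

    # Riemann sums approximating the Gaussian expectations.
    mu0 = float(np.sum(s * pdf) * dz)
    mu1 = float(np.sum(z * s * pdf) * dz)
    second_moment = float(np.sum(s**2 * pdf) * dz)

    # Clamp at zero to guard against tiny negative values from round-off.
    mu_star = float(np.sqrt(max(second_moment - mu0**2 - mu1**2, 0.0)))
    return mu0, mu1, mu_star


def relu(z):
    return np.maximum(z, 0.0)


def make_quadratic(a, b, c):
    """Hypothetical second-order polynomial a + b*z + c*(z^2 - 1),
    written in the Hermite basis so the coefficients are easy to read off."""
    return lambda z: a + b * z + c * (z**2 - 1.0)


if __name__ == "__main__":
    print("ReLU:     ", gaussian_equivalent_coefficients(relu))
    print("quadratic:", gaussian_equivalent_coefficients(make_quadratic(0.1, 0.5, 0.3)))
```

For the quadratic written this way, Hermite orthogonality gives `mu0 = a`, `mu1 = b`, and `mu_star = |c|·√2`, so two differently shaped activations can be tuned to share the same three coefficients. Under the assumed parameterization, that is one way to read the abstract's remark that the functions are optimized "regardless of the actual form".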

Taxonomy

Topics: Neural Networks and Applications · Fault Detection and Control Systems · Gaussian Processes and Bayesian Inference