Learning Activation Functions to Improve Deep Neural Networks

Forest Agostinelli; Matthew Hoffman; Peter Sadowski; Pierre Baldi

arXiv:1412.6830 · cs.NE · April 22, 2015 · ICLR · 349 cites

TL;DR

This paper introduces a learnable piecewise linear activation function for neural networks, optimized via gradient descent, leading to improved performance on image classification and physics benchmarks.

Contribution

It proposes a novel adaptive activation function that is learned independently for each neuron, enhancing deep neural network performance.

Findings

01 Reported state-of-the-art results at the time of publication on CIFAR-10 (7.51%) and CIFAR-100 (30.83%).

02 Improved performance on a high-energy physics benchmark involving Higgs boson decay modes.

03 Showed that networks with learned, per-neuron activation functions improve on architectures built from static rectified linear units.

Abstract

Artificial neural networks typically have a fixed, non-linear activation function at each neuron. We have designed a novel form of piecewise linear activation function that is learned independently for each neuron using gradient descent. With this adaptive activation function, we are able to improve upon deep neural network architectures composed of static rectified linear units, achieving state-of-the-art performance on CIFAR-10 (7.51%), CIFAR-100 (30.83%), and a benchmark from high-energy physics involving Higgs boson decay modes.
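
The abstract does not reproduce the unit's formula, so the following is only a minimal sketch under an assumption: that the learned piecewise linear activation takes the hinge-sum form h_i(x) = max(0, x) + sum over s of a_i^s * max(0, -x + b_i^s), where each neuron i has its own slopes a_i^s and hinge locations b_i^s and the number of hinges S is a hyperparameter. The function names `apl_forward` and `apl_param_grads` and the array layout are hypothetical, chosen for illustration. The sketch shows the forward pass and the gradients that gradient descent would use to update the per-neuron parameters.

```python
import numpy as np

def apl_forward(x, a, b):
    """Assumed piecewise linear unit: ReLU plus S learned hinges per neuron.

    x: (batch, n_units) pre-activations
    a: (S, n_units) hinge slopes, one set per neuron (assumed layout)
    b: (S, n_units) hinge locations, one set per neuron (assumed layout)
    """
    # hinge[s, n, i] = max(0, -x[n, i] + b[s, i])
    hinge = np.maximum(0.0, -x[None, :, :] + b[:, None, :])
    return np.maximum(0.0, x) + np.sum(a[:, None, :] * hinge, axis=0)

def apl_param_grads(x, a, b, grad_out):
    """Gradients of a scalar loss w.r.t. a and b, given grad_out = dL/dh."""
    pre = -x[None, :, :] + b[:, None, :]       # (S, batch, n_units)
    active = pre > 0                           # where each hinge is switched on
    hinge = np.where(active, pre, 0.0)
    # dh/da = hinge value; dh/db = a where the hinge is active, else 0
    grad_a = np.sum(grad_out[None, :, :] * hinge, axis=1)
    grad_b = np.sum(grad_out[None, :, :] * a[:, None, :] * active, axis=1)
    return grad_a, grad_b

# One gradient descent step on the activation parameters (toy values).
x = np.array([[-1.0, 2.0]])                    # one example, two neurons
a = np.array([[0.5, 0.0]])                     # S = 1 hinge per neuron
b = np.array([[0.0, 0.0]])
h = apl_forward(x, a, b)
grad_a, grad_b = apl_param_grads(x, a, b, np.ones_like(h))
learning_rate = 0.1                            # illustrative value only
a_new = a - learning_rate * grad_a
b_new = b - learning_rate * grad_b
```

With all slopes set to zero the unit reduces to a plain rectified linear unit, which is one way to read the abstract's comparison against static ReLU architectures: under this assumed form, the learned unit can fall back to the fixed one.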

Taxonomy

Topics: Computational Physics and Python Applications · Seismic Imaging and Inversion Techniques · Medical Imaging Techniques and Applications