Differential Equation Units: Learning Functional Forms of Activation Functions from Data

MohamadAli Torkamani; Shiv Shankar; Amirmohammad Rooshenas; Phillip Wallis

arXiv:1909.03069·cs.LG·September 10, 2019

TL;DR

This paper introduces differential equation units (DEUs) that allow neural network neurons to learn and adapt their activation functions during training, leading to more compact models with comparable or better performance.

Contribution

The paper presents a neuron design whose activation function is learned from a family of solutions to an ordinary differential equation, which improves network adaptability and parameter efficiency.

Findings

1. DEUs enable neurons to learn nonlinear activation functions from data.

2. Networks with DEUs achieve comparable or superior performance with fewer parameters.

3. DEUs lead to more compact and adaptable neural network architectures.

Abstract

Most deep neural networks use simple, fixed activation functions, such as sigmoids or rectified linear units, regardless of domain or network structure. We introduce differential equation units (DEUs), an improvement to modern neural networks that enables each neuron to learn a particular nonlinear activation function from a family of solutions to an ordinary differential equation. Specifically, each neuron may change its functional form during training based on the behavior of the other parts of the network. We show that using neurons with DEU activation functions results in a more compact network capable of achieving comparable, if not superior, performance when compared to much larger networks.
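To make the idea of "a family of solutions to an ordinary differential equation" concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the ODE, so the sketch assumes a homogeneous second-order linear ODE with constant coefficients, a*y'' + b*y' + c*y = 0, with fixed initial conditions y(0) = y0 and y'(0) = v0. The function name ode_activation and all parameter names are hypothetical. In a trained network the coefficients would presumably be learnable per-neuron parameters; here they are set by hand to show how changing them changes the functional form.

```python
import cmath
import math


def ode_activation(t, a, b, c, y0=0.0, v0=1.0):
    """Closed-form solution y(t) of a*y'' + b*y' + c*y = 0.

    Assumed (hypothetical) setup: constant coefficients a, b, c and
    initial conditions y(0) = y0, y'(0) = v0.
    """
    if a == 0:
        raise ValueError("a must be nonzero for a second-order ODE")

    # Roots of the characteristic polynomial a*r^2 + b*r + c = 0.
    disc = cmath.sqrt(b * b - 4 * a * c)
    r1 = (-b + disc) / (2 * a)
    r2 = (-b - disc) / (2 * a)

    if abs(r1 - r2) < 1e-12:
        # Repeated root: y = (k1 + k2 * t) * exp(r * t)
        k1 = y0
        k2 = v0 - r1 * y0
        y = (k1 + k2 * t) * cmath.exp(r1 * t)
    else:
        # Distinct roots: y = k1 * exp(r1 * t) + k2 * exp(r2 * t)
        k1 = (v0 - r2 * y0) / (r1 - r2)
        k2 = y0 - k1
        y = k1 * cmath.exp(r1 * t) + k2 * cmath.exp(r2 * t)

    # With real coefficients and real initial conditions the imaginary
    # parts cancel, so the real part is the solution.
    return y.real


# Same unit, three coefficient settings, three functional forms.
t = 0.5
linear = ode_activation(t, a=1.0, b=0.0, c=0.0)   # y'' = 0     -> y = t
sine = ode_activation(t, a=1.0, b=0.0, c=1.0)     # y'' + y = 0 -> y = sin(t)
sinh = ode_activation(t, a=1.0, b=0.0, c=-1.0)    # y'' - y = 0 -> y = sinh(t)

# Sanity checks against the known closed forms.
assert math.isclose(linear, t)
assert math.isclose(sine, math.sin(t))
assert math.isclose(sinh, math.sinh(t))
```

Under these assumptions, a single parameterized unit covers linear, oscillatory, and exponential-type activations, which is one way to read the abstract's claim that a neuron can change its functional form during training.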

Code & Models

Repositories

rooshenas/deu (PyTorch, official)
