FlexAct: Why Learn when you can Pick?

Ramnath Kumar; Kyle Ritscher; Junmin Judy; Lawrence Liu; Cho-Jui Hsieh

arXiv:2601.06441·cs.LG·January 13, 2026

Open Access

TL;DR

FlexAct introduces a differentiable discrete selection mechanism for activation functions using Gumbel-Softmax, enabling neural networks to adaptively choose the most suitable activation during training, improving accuracy and flexibility.

Contribution

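We propose a novel framework in which a neural network learns, during training, a discrete choice among a predefined set of activation functions. The Gumbel-Softmax trick keeps that choice differentiable, so the selection adapts to the task through ordinary gradient-based training.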

Findings

01 Consistently selects optimal activation functions on synthetic datasets

02 Enhances predictive accuracy and architectural flexibility

03 Bridges theoretical advances with practical utility

Abstract

Learning activation functions has emerged as a promising direction in deep learning, allowing networks to adapt activation mechanisms to task-specific demands. In this work, we introduce a novel framework that employs the Gumbel-Softmax trick to enable discrete yet differentiable selection among a predefined set of activation functions during training. Our method dynamically learns the optimal activation function independently of the input, thereby enhancing both predictive accuracy and architectural flexibility. Experiments on synthetic datasets show that our model consistently selects the most suitable activation function, underscoring its effectiveness. These results connect theoretical advances with practical utility, paving the way for more adaptive and modular neural architectures in complex learning scenarios.
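
The selection mechanism described in the abstract can be illustrated with a minimal, forward-pass-only sketch. It is not the authors' implementation: the candidate set, the names `ACTIVATIONS`, `gumbel_softmax`, and `flex_activation`, the temperature `tau`, and the `hard` switch are all assumptions made for illustration. One logit per candidate activation is perturbed with Gumbel noise and passed through a temperature-scaled softmax, giving relaxed one-hot weights that do not depend on the input `x`. In an autodiff framework, gradients would flow through these weights to the logits; plain Python is used here only to show the arithmetic.

```python
import math
import random

# Illustrative candidate set (assumption: the abstract does not list the paper's candidates).
ACTIVATIONS = {
    "relu": lambda x: max(0.0, x),
    "tanh": math.tanh,
    "sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "identity": lambda x: x,
}


def gumbel_softmax(logits, tau, rng):
    """Return relaxed one-hot weights: softmax((logits + Gumbel noise) / tau)."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) sample via inverse transform: g = -log(-log(u)), u ~ Uniform(0, 1).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
        noisy.append((logit - math.log(-math.log(u))) / tau)
    peak = max(noisy)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def flex_activation(x, logits, tau, rng, hard=False):
    """Apply a Gumbel-Softmax-weighted mix of the candidate activations to a scalar x."""
    weights = gumbel_softmax(logits, tau, rng)
    if hard:
        # Discrete selection: keep only the highest-weight candidate (one-hot weights).
        best = max(range(len(weights)), key=weights.__getitem__)
        weights = [1.0 if i == best else 0.0 for i in range(len(weights))]
    output = sum(w * f(x) for w, f in zip(weights, ACTIVATIONS.values()))
    return output, weights


# Usage: the logits would normally be trainable parameters; here they are fixed by hand.
rng = random.Random(0)
soft_out, soft_weights = flex_activation(0.5, [0.0, 2.0, 0.0, 0.0], tau=0.5, rng=rng)
hard_out, hard_weights = flex_activation(0.5, [0.0, 2.0, 0.0, 0.0], tau=0.5, rng=rng, hard=True)
```

Lower values of `tau` push the soft weights toward a one-hot vector, while higher values spread them more evenly across candidates. The `hard=True` branch shows the discrete pick only; a trainable version would typically pair it with a straight-through gradient estimator, which this sketch omits.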

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques