Farkas layers: don't shift the data, fix the geometry
Aram-Alexandre Pooladian, Chris Finlay, Adam M Oberman

TL;DR
This paper introduces Farkas layers, a geometrically motivated method for training deep neural networks that guarantees at least one neuron is active at each layer where it is applied, improving training capacity without relying on batch normalization or specific weight initialization.
Contribution
The paper presents Farkas layers, an approach built on elementary results from linear programming that ensures at least one neuron is active at a given layer, so that deep networks can be trained without batch normalization or careful weight initialization.
Findings
Significant empirical improvement in training capacity for residual networks with ReLU activation.
Effective across a broad range of network sizes on benchmark datasets.
Improvement holds in the absence of batch normalization or specialized initialization.
Abstract
Successfully training deep neural networks often requires either batch normalization or appropriate weight initialization, both of which come with their own challenges. We propose an alternative, geometrically motivated method for training. Using elementary results from linear programming, we introduce Farkas layers: a method that ensures at least one neuron is active at a given layer. Focusing on residual networks with ReLU activation, we empirically demonstrate a significant improvement in training capacity in the absence of batch normalization or methods of initialization across a broad range of network sizes on benchmark datasets.
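The abstract does not spell out the construction, so the following is only a minimal sketch of one way to obtain the stated property (at least one active neuron per layer), not necessarily the authors' exact formulation. It assumes a single extra neuron whose weights are the negative sum of the other rows and whose bias is a small positive constant minus the sum of the other biases. The pre-activations then sum to that positive constant, so they cannot all be non-positive. The names farkas_preactivations, relu, and eps are hypothetical and chosen for illustration.

```python
def farkas_preactivations(W, b, x, eps=1e-3):
    """Pre-activations of a layer with one extra, dependent neuron.

    W is a list of free weight rows, b the free biases, x the input.
    ASSUMPTION (not taken from the abstract): the extra neuron uses
    weights -sum(rows of W) and bias eps - sum(b), so all
    pre-activations sum to eps > 0, up to floating-point error.
    """
    # Pre-activations of the ordinary (free) neurons: w_i . x + b_i
    free = [sum(w_j * x_j for w_j, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]
    # Dependent neuron built from the free parameters.
    w_extra = [-sum(row[j] for row in W) for j in range(len(x))]
    b_extra = eps - sum(b)
    extra = sum(w_j * x_j for w_j, x_j in zip(w_extra, x)) + b_extra
    return free + [extra]


def relu(z):
    """Elementwise ReLU on a list of floats."""
    return [max(0.0, v) for v in z]


if __name__ == "__main__":
    # Biases are strongly negative, so both free neurons are inactive.
    W = [[1.0, -2.0], [0.5, 0.25]]
    b = [-5.0, -5.0]
    x = [1.0, 1.0]
    z = farkas_preactivations(W, b, x)
    print("pre-activations:", z)
    print("after ReLU:", relu(z))
```

In this sketch the extra neuron adds no trainable parameters, since it is fully determined by the free rows. Whether the paper uses a sum, a mean, or another positive combination is not stated in the abstract.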
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Batch Normalization
