Farkas layers: don't shift the data, fix the geometry

Aram-Alexandre Pooladian; Chris Finlay; Adam M Oberman

arXiv:1910.02840·cs.LG·October 8, 2019

TL;DR

This paper introduces Farkas layers, a geometrically motivated method for training deep neural networks that guarantees at least one neuron is active at a given layer, improving training capacity without relying on batch normalization or a particular weight initialization.

Contribution

The paper presents Farkas layers, an approach built on elementary results from linear programming that ensures at least one neuron is active at a given layer, so that training does not depend on batch normalization or a careful choice of weight initialization.

Findings

01. Training capacity improves significantly for residual networks with ReLU activation.

02. The improvement holds across a broad range of network sizes on benchmark datasets.

03. The results are obtained without batch normalization or a specialized initialization method.

Abstract

Successfully training deep neural networks often requires either batch normalization or appropriate weight initialization, both of which come with their own challenges. We propose an alternative, geometrically motivated method for training. Using elementary results from linear programming, we introduce Farkas layers: a method that ensures at least one neuron is active at a given layer. Focusing on residual networks with ReLU activation, we empirically demonstrate a significant improvement in training capacity in the absence of batch normalization or methods of initialization across a broad range of network sizes on benchmark datasets.
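
The abstract does not spell out the construction, so the following is only a minimal sketch of one way to guarantee that at least one neuron is active. It assumes the layer appends one extra neuron whose pre-activation is the negated sum of the others plus a small positive offset. With that choice all pre-activations sum to eps > 0, so they cannot all be nonpositive and at least one ReLU output is strictly positive. The names farkas_style_layer and eps are hypothetical, and the code is not taken from the APooladian/FarkasLayers repository.

```python
# Dependency-free sketch; an assumed construction, not the authors' implementation.

def relu(z):
    """Elementwise ReLU on a list of floats."""
    return [max(0.0, v) for v in z]

def linear(W, b, x):
    """Pre-activations z = W x + b for a weight matrix given as a list of rows."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def farkas_style_layer(W, b, x, eps=1e-3):
    """Linear layer with one extra neuron (hypothetical helper).

    The extra pre-activation is -sum(z) + eps, so the pre-activations
    sum to eps > 0 and at least one of them is strictly positive.
    """
    z = linear(W, b, x)
    z_extra = -sum(z) + eps
    return relu(z + [z_extra])

# A layer whose ordinary neurons are all inactive for this input.
W = [[-1.0, -2.0], [-0.5, -3.0], [-2.0, -1.0]]
b = [-0.1, -0.2, -0.3]
x = [1.0, 2.0]

plain = relu(linear(W, b, x))          # every entry is zero
guarded = farkas_style_layer(W, b, x)  # the extra neuron is active
```

In this example the plain layer passes no signal (and hence no gradient) for the input x, while the guarded layer keeps one path open through the extra neuron. The offset eps only controls how far from zero the guaranteed activation is in the worst case.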

Code & Models

Repositories

APooladian/FarkasLayers
pytorch

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning

Methods: Batch Normalization