Population Gradients improve performance across data-sets and architectures in object classification
Yurika Sakai, Andrey Kormilitzin, Qiang Liu, Alejo Nevado-Holgado

TL;DR
Population Gradients (PG) is a novel training method that uses a population of neural networks to estimate more accurate gradients, significantly boosting performance across various architectures, datasets, and training conditions.
Contribution
The paper introduces Population Gradients, a new approach that improves neural network training by providing more accurate gradient estimates using a population-based method.
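This summary does not say how the population is built or how its gradients are combined. As a minimal sketch of one possible reading, assume the population consists of noisy copies of the current weights and that their gradients are simply averaged. The toy loss, the Gaussian perturbation, and the names `population_gradient`, `pop_size`, and `sigma` are all hypothetical illustrations, not the authors' implementation.

```python
import random

def grad(w):
    """Gradient of a toy loss sum((w_i - 1)^2); stands in for backprop."""
    return [2.0 * (wi - 1.0) for wi in w]

def population_gradient(w, grad_fn, pop_size=8, sigma=0.01, rng=random):
    """Average the gradients of pop_size noisy copies of w.

    ASSUMPTION: Gaussian weight perturbation plus plain averaging is an
    illustrative guess; the summary does not confirm this scheme.
    """
    total = [0.0] * len(w)
    for _ in range(pop_size):
        # Build one population member by perturbing every weight.
        member = [wi + rng.gauss(0.0, sigma) for wi in w]
        g = grad_fn(member)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / pop_size for t in total]

def train(w, steps=100, lr=0.1, seed=0):
    """Plain gradient descent driven by the averaged population gradient."""
    rng = random.Random(seed)
    for _ in range(steps):
        g = population_gradient(w, grad, rng=rng)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

final_w = train([3.0, -2.0])
```

On this toy problem `final_w` ends up close to the minimiser `[1.0, 1.0]`. In a real network, `grad_fn` would be replaced by a backward pass over a mini-batch.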
Findings
Significant performance improvements across multiple datasets and architectures.
Effective when combined with standard training techniques.
Performance gains comparable or superior to existing methods.
Abstract
The most successful methods, such as ReLU transfer functions, batch normalization, Xavier initialization, dropout, learning rate decay, or dynamic optimizers, have become standards in the field due, in particular, to their ability to increase the performance of Neural Networks (NNs) significantly and in almost all situations. Here we present a new method to calculate the gradients while training NNs, and show that it significantly improves final performance across architectures, data-sets, hyper-parameter values, training length, and model sizes, including when it is combined with other common performance-improving methods (such as the ones mentioned above). Besides being effective in the wide array of situations that we have tested, the increase in performance (e.g. F1) it provides is as high as or higher than that of all the other widespread performance-improving methods that we…
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
