An Analysis of Alternating Direction Method of Multipliers for Feed-forward Neural Networks
Seyedeh Niusha Alavi Foumani, Ce Guo, Wayne Luk

TL;DR
This paper introduces a hardware-compatible neural network training algorithm based on ADMM and iterative least squares, achieving improved accuracy over SGD and Adam while being scalable and suitable for hardware implementation.
Contribution
The paper presents a novel ADMM-based training method for neural networks that avoids matrix inversion, enabling hardware scalability and parallelization.
Findings
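Achieved 6.9% and 6.8% higher accuracy than SGD and Adam, respectively, on the HIGGS dataset (four-layer network, hidden size 28).
Achieved 21.0% and 2.2% higher accuracy than SGD and Adam, respectively, on the IRIS dataset (three-layer network, hidden size 8).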
Maintained performance by replacing matrix inversion with iterative least squares.
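The summary above does not say which iterative least-squares scheme the paper uses. As a minimal sketch of the general idea, the block below assumes a plain gradient iteration on one layer's weight subproblem, min_W ||W A - Z||_F^2, where the usual closed form would require inverting A A^T. The names A, Z, W, `matmul`, `transpose`, and `iterative_least_squares`, along with the step-size rule, are illustrative assumptions and are not taken from the paper.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]


def transpose(X):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def iterative_least_squares(A, Z, steps=500):
    """Approximate argmin_W ||W A - Z||_F^2 without inverting any matrix.

    A: layer inputs, shape (n_in, n_samples)    (assumed layout)
    Z: layer targets, shape (n_out, n_samples)  (assumed layout)
    Returns W with shape (n_out, n_in).
    """
    At = transpose(A)
    gram = matmul(A, At)   # A A^T, computed once and reused
    ZAt = matmul(Z, At)    # Z A^T, also fixed across iterations

    # The Frobenius norm of A A^T upper-bounds its largest eigenvalue,
    # so 1 / ||A A^T||_F is a conservative step size that cannot diverge.
    fro = sum(g * g for row in gram for g in row) ** 0.5
    W = [[0.0] * len(gram) for _ in Z]
    if fro == 0.0:
        return W  # all-zero inputs: any W is optimal, return zeros
    eta = 1.0 / fro

    for _ in range(steps):
        WG = matmul(W, gram)
        # Gradient of 0.5 * ||W A - Z||_F^2 is W (A A^T) - Z A^T.
        W = [[w - eta * (wg - t) for w, wg, t in zip(w_row, wg_row, t_row)]
             for w_row, wg_row, t_row in zip(W, WG, ZAt)]
    return W


if __name__ == "__main__":
    # Hypothetical toy data: 2 input features, 3 samples, 2 outputs.
    A = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
    Z = [[2.0, -1.0, 1.0], [0.5, 3.0, 3.5]]
    print(iterative_least_squares(A, Z))
```

Each iteration uses only multiply-accumulate operations on fixed-size matrices, which is the kind of workload that maps well to parallel hardware; the single square root and division happen once, outside the loop. A faster-converging scheme such as conjugate gradient could be substituted under the same interface.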
Abstract
In this work, we present a hardware-compatible neural network training algorithm that uses the alternating direction method of multipliers (ADMM) and iterative least-squares methods. The motivation behind this approach is to devise a method of training neural networks that is scalable and can be parallelised. These characteristics make the algorithm suitable for hardware implementation. We achieve 6.9% and 6.8% better accuracy than SGD and Adam, respectively, with a four-layer neural network with a hidden size of 28 on the HIGGS dataset. Likewise, we observe 21.0% and 2.2% accuracy improvements over SGD and Adam, respectively, on the IRIS dataset with a three-layer neural network with a hidden size of 8. The method also avoids matrix inversion, which is challenging to implement in hardware. We assessed the impact of avoiding matrix…
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and ELM · Face and Expression Recognition
Methods: Alternating Direction Method of Multipliers · Adam · Stochastic Gradient Descent
